ANTI-VIRAL VECTORS

Field of the Invention 

The present invention relates to novel viral vectors capable of delivering anti-viral
inhibitory RNA molecules to target cells. 

Background to the Invention 

The application of gene therapy to the treatment of AIDS and HIV infection has been
discussed widely (Lever, 1995). The types of therapeutic gene proposed usually fall into
one of two broad categories. In the first, the gene encodes protein products that inhibit the
virus in a number of possible ways. One example of such a protein is the RevM10
derivative of the HIV Rev protein. The RevM10 protein acts as a transdominant negative
mutant and so competitively inhibits Rev function in the virus. Like many of the
protein-based strategies, the RevM10 protein is a derivative of a native HIV protein. While this
provides the basis for the anti-HIV effect, it also has serious disadvantages. In particular,
this type of strategy demands that in the absence of the virus there is little or no expression
of the gene. Otherwise, healthy cells harbouring the gene become a target for the host
cytotoxic T lymphocyte (CTL) system, which recognises the foreign protein. The second
broad category of therapeutic gene circumvents these CTL problems. The therapeutic gene
encodes inhibitory RNA molecules; RNA is not a target for CTL recognition.

There are several types of inhibitory RNA molecules known: anti-sense RNA, ribozymes, 
competitive decoys and external guide sequences (EGSs).

External guide sequences, first identified by Forster and Altman (1990), are RNA
sequences that are capable of directing the cellular ribonucleoprotein RNase P to cleave a
particular RNA sequence. In vivo, they are found as part of precursor tRNAs, where they
direct cleavage of the tRNA precursor by RNase P to form
mature tRNA. However, in principle, any RNA can be targeted by a custom-designed EGS
RNA for specific cleavage by RNase P in vitro or in vivo. For example, Yuan et al. (1992)
demonstrate a reduction in the levels of chloramphenicol acetyltransferase activity in cells in
tissue culture as a result of introducing an appropriately designed EGS.

In recent years a number of laboratories have developed retroviral vector systems based on 
HIV. In the context of anti-HIV gene therapy these vectors have a number of advantages
over the more conventional murine-based vectors such as murine leukaemia virus (MLV)
vectors. Firstly, HIV vectors would target precisely those cells that are susceptible to HIV 
infection. Secondly, the HIV-based vector would transduce cells such as macrophages that 
are normally refractory to transduction by murine vectors. Thirdly, the anti-HIV vector 
genome would be propagated through the CD4+ cell population by any virus (HIV) that
escaped the therapeutic strategy. This is because the vector genome has the packaging 
signal that will be recognised by the viral particle packaging system. These various 
attributes make HIV-vectors a powerful tool in the field of anti-HIV gene therapy. 

A combination of inhibitory RNA molecules and an HIV-based vector would be attractive
as a therapeutic strategy. However, until now this has not been possible. Vector particle
production takes place in producer cells which express the packaging components of the
particles and package the vector genome. The inhibitory RNA sequences that are designed
to destroy the viral RNA would therefore also interrupt the expression of the components
of the HIV-based vector system during vector production. The present invention aims to
overcome this problem.

Summary of the Invention 

It is therefore an object of the invention to provide a system and method for producing viral
particles, in particular HIV particles, which carry nucleotide constructs encoding inhibitory
RNA molecules such as external guide sequences, optionally together with other classes of
inhibitory RNA molecules such as ribozymes and/or antisense RNAs directed against a
corresponding virus, such as HIV, within a target cell, that overcomes the above-mentioned
problems. The system includes both a viral genome encoding the inhibitory RNA
molecules and nucleotide constructs encoding the components required for packaging the
viral genome in a producer cell. However, in contrast to the prior art, although the
packaging components have substantially the same amino acid sequence as the
corresponding components of the target virus, the inhibitory RNA molecules do not affect
production of the viral particles in the producer cells because the nucleotide sequence of
the packaging components used in the viral system has been modified to prevent the
inhibitory RNA molecules from effecting cleavage or degradation of the RNA transcripts
produced from the constructs. Such a viral particle may be used to treat viral infections, in
particular HIV infections.

Accordingly the present invention provides a viral vector system comprising: 

(i) a first nucleotide sequence encoding an external guide sequence capable of binding 
to and effecting the cleavage by RNase P of a second nucleotide sequence, or transcription 
product thereof, encoding a viral polypeptide required for the assembly of viral particles; 
and 

(ii) a third nucleotide sequence encoding said viral polypeptide required for the 
assembly of viral particles, which third nucleotide sequence has a different nucleotide 
sequence to the second nucleotide sequence such that the third nucleotide sequence, or 
transcription product thereof, is resistant to cleavage directed by the external guide 
sequence. 

Preferably, said system further comprises at least one further first nucleotide sequence 
encoding a gene product capable of binding to and effecting the cleavage, directly or 
indirectly, of a second nucleotide sequence, or transcription product thereof, encoding a 
viral polypeptide required for the assembly of viral particles, wherein the gene product is 
selected from an external guide sequence, a ribozyme and an anti-sense ribonucleic acid. 

In another aspect, the present invention provides a viral vector production system 
comprising: 

(i) a viral genome comprising at least one first nucleotide sequence encoding a gene 
product capable of binding to and effecting the cleavage, directly or indirectly, of a second 
nucleotide sequence, or transcription product thereof, encoding a viral polypeptide required 
for the assembly of viral particles; 

(ii) a third nucleotide sequence encoding said viral polypeptide required for the
assembly of the viral genome into viral particles, which third nucleotide sequence has a
different nucleotide sequence to the second nucleotide sequence such that said third 
nucleotide sequence, or transcription product thereof, is resistant to cleavage directed by 
said gene product; 

wherein at least one of the gene products is an external guide sequence capable of binding
to and effecting the cleavage by RNase P of the second nucleotide sequence. 

Preferably, in addition to an external guide sequence, at least one gene product is selected 
from a ribozyme and an anti-sense ribonucleic acid, preferably a ribozyme. 


Preferably, the viral vector is a retroviral vector, more preferably a lentiviral vector, such as 
an HIV vector. The second nucleotide sequence and the third nucleotide sequence are
typically from the same viral species, more preferably from the same viral strain.
Generally, the viral genome is also from the same viral species, more preferably from the
same viral strain.

In the case of retroviral vectors, the polypeptide required for the assembly of viral particles 
is selected from gag, pol and env proteins. Preferably at least the gag and pol sequences 
are lentiviral sequences, more preferably HIV sequences. Alternatively, or in addition, the 
env sequence is a lentiviral sequence, more preferably an HIV sequence.

In a preferred embodiment, the third nucleotide sequence is resistant to cleavage directed 
by the gene product as a result of one or more conservative alterations in the nucleotide 
sequence which remove cleavage sites recognised by the at least one gene product and/or 
binding sites for the at least one gene product. For example, where the gene product is an
EGS, the third nucleotide sequence is adapted to prevent EGS binding and/or to remove the 
RNase P consensus cleavage site. Alternatively, where the gene product is a ribozyme, the 
third nucleotide sequence is adapted to be resistant to cleavage by the ribozyme. 
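
Purely by way of illustration, the principle of a conservative alteration can be sketched computationally. The Python sketch below recodes a short RNA using synonymous codons, so that the encoded amino acids are unchanged while the nucleotide sequence differs. The codon table is a small subset of the standard genetic code, and the names `translate`, `recode` and `PREFERRED`, the choice of replacement codons and the example sequence are assumptions made for this example only; they are not sequences or methods taken from the invention.

```python
# Illustrative sketch only: swap codons for synonymous codons so the protein
# is unchanged while the nucleotide sequence differs.
# CODON_TO_AA is a deliberately small subset of the standard genetic code.
CODON_TO_AA = {
    "GUA": "V", "GUC": "V", "GUG": "V", "GUU": "V",
    "GCA": "A", "GCC": "A", "GCG": "A", "GCU": "A",
    "GGA": "G", "GGC": "G", "GGG": "G", "GGU": "G",
    "AUG": "M",
}

# Hypothetical choice of one replacement codon per amino acid (an assumption).
PREFERRED = {"V": "GUG", "A": "GCC", "G": "GGC", "M": "AUG"}


def codons(rna):
    """Split an RNA coding sequence into codons."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def translate(rna):
    """Translate using the subset table above."""
    return "".join(CODON_TO_AA[c] for c in codons(rna))


def recode(rna):
    """Replace every codon by the chosen synonymous codon."""
    return "".join(PREFERRED[CODON_TO_AA[c]] for c in codons(rna))


wild_type = "AUGGUAGCUGGU"      # made-up example encoding M-V-A-G
modified = recode(wild_type)    # same protein, different nucleotides

print(modified, translate(wild_type) == translate(modified))
```

A real design would additionally need to confirm that the particular binding or cleavage sites of interest are actually absent from the recoded sequence, which this sketch does not attempt.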

Preferably the third nucleotide sequence is codon optimised for expression in host cells.
The host cells, which term includes producer cells and packaging cells, are typically 
mammalian cells.

In a particularly preferred embodiment, (i) the viral genome is an HIV genome comprising
nucleotide sequences encoding anti-HIV EGSs and optionally anti-HIV ribozyme
sequences directed against HIV packaging component sequences (such as gag.pol) in a
target HIV and (ii) the viral system for producing packaged HIV particles further
comprises nucleotide constructs encoding the same packaging components (such as gag.pol
proteins) as in the target HIV wherein the sequence of the nucleotide constructs is different
from that found in the target HIV so that the anti-HIV EGS and anti-HIV ribozyme
sequences cannot effect cleavage or degradation of the gag.pol transcripts during
production of the HIV particles in producer cells.

The present invention also provides a viral particle comprising a viral vector according to 
the present invention and one or more polypeptides encoded by the third nucleotide 
sequences according to the present invention. For example the present invention provides 
a viral particle produced using the viral vector production system of the invention.

In another aspect, the present invention provides a method for producing a viral particle 
which method comprises introducing into a host cell (i) a viral genome vector according to 
the present invention; (ii) one or more third nucleotide sequences according to the present 
invention; and (iii) nucleotide sequences encoding the other essential viral packaging
components not encoded by the one or more third nucleotide sequences. 

The present invention further provides a viral particle produced by the method of the
invention. 


The present invention also provides a pharmaceutical composition comprising a viral 
particle according to the present invention together with a pharmaceutically acceptable 
carrier or diluent. 

The viral system of the invention or viral particles of the invention may be used to treat
viral infections, particularly retroviral infections such as lentiviral infections including HIV 
infections. Thus the present invention provides a method of treating a viral infection which 




. -6- P006478GB ATM 

method comprises administering to a human or animal patient suffering from the viral 
infection an effective amount of a viral system, viral particle or pharmaceutical 
composition of the present invention. 

The invention relates in particular to HIV-based vectors carrying anti-HIV EGSs.
However, the invention can be applied to any other virus, in particular any other lentivirus, 
for which treatment by gene therapy may be desirable. The invention is illustrated herein 
for HIV, but this is not considered to limit the scope of the invention to HIV-based anti- 
HIV vectors. 


Detailed Description of the Invention 

The term "viral vector" refers to a nucleotide construct comprising a viral genome capable
of being transcribed in a host cell, which genome comprises sufficient viral genetic
information to allow packaging of the viral RNA genome, in the presence of packaging
components, into a viral particle capable of infecting a target cell. Infection of the target
cell includes reverse transcription and integration into the target cell genome, where
appropriate for particular viruses. The viral vector in use typically carries heterologous
coding sequences (nucleotides of interest) which are to be delivered by the vector to the
target cell, for example a first nucleotide sequence encoding an EGS. A viral vector is
incapable of independent replication to produce infectious viral particles within the final
target cell.

The term "viral vector system" is intended to mean a kit of parts which can be used when
combined with other necessary components for viral particle production to produce viral
particles in host cells. For example, the first nucleotide sequence may typically be present
in a plasmid vector construct suitable for cloning the first nucleotide sequence into a viral
genome vector construct. When combined in a kit with a third nucleotide sequence, which
will also typically be present in a separate plasmid vector construct, the resulting
combination of plasmid containing the first nucleotide sequence and plasmid containing
the third nucleotide sequence comprises the essential elements of the invention. Such a kit
may then be used by the skilled person in the production of suitable viral vector genome
constructs which when transfected into a host cell together with the plasmid containing the
third nucleotide sequence, and optionally nucleic acid constructs encoding other
components required for viral assembly, will lead to the production of infectious viral
particles.

Alternatively, the third nucleotide sequence may be stably present within a packaging cell 
line that is included in the kit. 

The kit may include the other components needed to produce viral particles, such as host 
cells and other plasmids encoding essential viral polypeptides required for viral assembly. 
By way of example, the kit may contain (i) a plasmid containing a first nucleotide sequence 
encoding an anti-HIV EGS and (ii) a plasmid containing a third nucleotide sequence 
encoding a modified HIV gag.pol construct which is resistant to cleavage directed by the anti-HIV
EGS. Optional components would then be (a) an HIV viral genome construct with
suitable restriction enzyme recognition sites for cloning the first nucleotide sequence into 
the viral genome; and (b) a plasmid encoding a VSV-G env protein. Alternatively, nucleotide
sequences encoding viral polypeptides required for assembly of viral particles may be
provided in the kit as packaging cell lines comprising the nucleotide sequences, for 
example a VSV-G expressing cell line. 

The term "viral vector production system" refers to the viral vector system described above 
wherein the first nucleotide sequence has already been inserted into a suitable viral vector 
genome. 

Viral vectors are typically retroviral vectors, in particular lentiviral vectors such as HIV 
vectors. The retroviral vector of the present invention may be derived from or may be 
derivable from any suitable retrovirus. A large number of different retroviruses have been 
identified. Examples include: murine leukemia virus (MLV), human immunodeficiency 
virus (HIV), simian immunodeficiency virus, human T-cell leukemia virus (HTLV),
equine infectious anaemia virus (EIAV), mouse mammary tumour virus (MMTV), Rous
sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), Moloney murine leukemia virus
(Mo-MLV), FBR murine osteosarcoma virus (FBR MSV), Moloney murine sarcoma virus
(Mo-MSV), Abelson murine leukemia virus (A-MLV), Avian myelocytomatosis virus-29
(MC29), and Avian erythroblastosis virus (AEV). A detailed list of retroviruses may be
found in Coffin et al., 1997, "Retroviruses", Cold Spring Harbor Laboratory Press, Eds:
JM Coffin, SM Hughes, HE Varmus, pp 758-763.

Details on the genomic structure of some retroviruses may be found in the art. By way of 
example, details on HIV and Mo-MLV may be found from the NCBI Genbank (Genome 
Accession Nos. AF033819 and AF033811, respectively).

The lentivirus group can be split even further into "primate" and "non-primate". Examples 
of primate lentiviruses include human immunodeficiency virus (HIV), the causative agent 
of human acquired immunodeficiency syndrome (AIDS), and simian immunodeficiency virus
(SIV). The non-primate lentiviral group includes the prototype "slow virus" visna/maedi 
virus (VMV), as well as the related caprine arthritis-encephalitis virus (CAEV), equine 
infectious anaemia virus (EIAV) and the more recently described feline immunodeficiency 
virus (FIV) and bovine immunodeficiency virus (BIV). 

The basic structure of a retrovirus genome is a 5' LTR and a 3' LTR, between or within 
which are located a packaging signal to enable the genome to be packaged, a primer 
binding site, integration sites to enable integration into a host cell genome and gag, pol and 
env genes encoding the packaging components - these are polypeptides required for the 
assembly of viral particles. More complex retroviruses have additional features, such as 
rev and RRE sequences in HIV, which enable the efficient export of RNA transcripts of the 
integrated provirus from the nucleus to the cytoplasm of an infected target cell. 

In the provirus, these genes are flanked at both ends by regions called long terminal repeats 
(LTRs). The LTRs are responsible for proviral integration and transcription. LTRs also
serve as enhancer-promoter sequences and can control the expression of the viral genes. 
Encapsidation of the retroviral RNAs occurs by virtue of a psi sequence located at the 5' 
end of the viral genome.

The LTRs themselves are identical sequences that can be divided into three elements,
which are called U3, R and U5. U3 is derived from the sequence unique to the 3' end of
the RNA. R is derived from a sequence repeated at both ends of the RNA and U5 is
derived from the sequence unique to the 5' end of the RNA. The sizes of the three
elements can vary considerably among different retroviruses.

In a defective retroviral vector genome gag, pol and env may be absent or not functional.
The R regions at both ends of the RNA are repeated sequences. U5 and U3 represent
unique sequences at the 5' and 3' ends of the RNA genome respectively.

In a typical retroviral vector for use in gene therapy, at least part of one or more of the gag, 
pol and env protein coding regions essential for replication may be removed from the virus. 
This makes the retroviral vector replication-defective. The removed portions may even be 
replaced by a nucleotide sequence of interest (NOI), such as a first nucleotide sequence of 
the invention, to generate a virus capable of integrating its genome into a host genome but 
wherein the modified viral genome is unable to propagate itself due to a lack of structural 
proteins. When integrated in the host genome, expression of the NOI occurs - resulting in, 
for example, a therapeutic and/or a diagnostic effect. Thus, the transfer of an NOI into a 
site of interest is typically achieved by: integrating the NOI into the recombinant viral 
vector; packaging the modified viral vector into a virion coat; and allowing transduction of 
a site of interest - such as a targeted cell or a targeted cell population. 

A minimal retroviral genome for use in the present invention will therefore comprise (5') R 
- U5 - one or more first nucleotide sequences - U3-R (3'). However, the plasmid vector 
used to produce the retroviral genome within a host cell/packaging cell will also include 
transcriptional regulatory control sequences operably linked to the retroviral genome to 
direct transcription of the genome in a host cell/packaging cell. These regulatory 
sequences may be the natural sequences associated with the transcribed retroviral sequence, 
i.e. the 5' U3 region, or they may be a heterologous promoter such as another viral 
promoter, for example the CMV promoter.

Some retroviral genomes require additional sequences for efficient virus production. For
example, in the case of HIV, rev and RRE sequences are preferably included. However, the
requirement for rev and RRE can be reduced or eliminated by codon optimisation.

Once the retroviral vector genome is integrated into the genome of its target cell as proviral 
DNA, the inhibitory RNA sequences need to be expressed. In a retrovirus, the promoter is
located in the 5' LTR U3 region of the provirus. In retroviral vectors, the promoter driving
expression of a therapeutic gene may be the native retroviral promoter in the 5' U3 region,
or an alternative promoter engineered into the vector. The alternative promoter may 
physically replace the 5' U3 promoter native to the retrovirus, or it may be incorporated at 
a different place within the vector genome such as between the LTRs. 

Thus, the first nucleotide sequence will also be operably linked to a transcriptional 
regulatory control sequence to allow transcription of the first nucleotide sequence to occur 
in the target cell. The control sequence will typically be active in mammalian cells. The 
control sequence may, for example, be a viral promoter such as the natural viral promoter 
or a CMV promoter or it may be a mammalian promoter. It is particularly preferred to use 
a promoter that is preferentially active in the particular cell type or tissue type that the
virus to be treated primarily infects. Thus, in one embodiment, a tissue-specific regulatory
sequence may be used. The regulatory control sequences driving expression of the one or
more first nucleotide sequences may be constitutive or regulated promoters. 

Replication-defective retroviral vectors are typically propagated, for example to prepare 
suitable titres of the retroviral vector for subsequent transduction, by using a combination 
of a packaging or helper cell line and the recombinant vector. That is to say, the three
packaging proteins can be provided in trans.

A "packaging cell line" contains one or more of the retroviral gag, pol and env genes. The 
packaging cell line produces the proteins required for packaging retroviral DNA but it 
cannot bring about encapsidation due to the lack of a psi region. However, when a 
recombinant vector carrying an NOI and a psi region is introduced into the packaging cell 
line, the helper proteins can package the psi-positive recombinant vector to produce the
recombinant virus stock. This virus stock can be used to transduce cells to introduce the
NOI into the genome of the target cells. It is preferred to use a psi packaging signal, called
psi plus, that contains additional sequences spanning from upstream of the splice donor to
downstream of the gag start codon (Bender et al., 1987) since this has been shown to
increase viral titres.

The recombinant virus whose genome lacks all genes required to make viral proteins can 
transduce only once and cannot propagate. These viral vectors which are only capable of a
single round of transduction of target cells are known as replication defective vectors.
Hence, the NOI is introduced into the host/target cell genome without the generation of
potentially harmful retrovirus. A summary of the available packaging lines is presented in
Coffin et al., 1997 (ibid).

Retroviral packaging cell lines in which the gag, pol and env viral coding regions are 
carried on separate expression plasmids that are independently transfected into a packaging 
cell line are preferably used. This strategy, sometimes referred to as the three plasmid 
transfection method (Soneoka et al., 1995), reduces the potential for production of a
replication-competent virus since three recombination events are required for wild type viral
production. As recombination is greatly facilitated by homology, reducing or eliminating 
homology between the genomes of the vector and the helper can also be used to reduce the 
problem of replication-competent helper virus production. 

An alternative to stably transfected packaging cell lines is to use transiently transfected cell 
lines. Transient transfections may advantageously be used to measure levels of vector 
production when vectors are being developed. In this regard, transient transfection avoids 
the longer time required to generate stable vector-producing cell lines and may also be used 
if the vector or retroviral packaging components are toxic to cells. Components typically 
used to generate retroviral vectors include a plasmid encoding the gag/pol proteins, a 
plasmid encoding the env protein and a plasmid containing an NOI. Vector production 
involves transient transfection of one or more of these components into cells containing the 
other required components. If the vector encodes toxic genes or genes that interfere with 
the replication of the host cell, such as inhibitors of the cell cycle or genes that induce
apoptosis, it may be difficult to generate stable vector-producing cell lines, but transient
transfection can be used to produce the vector before the cells die. Also, cell lines have
been developed using transient transfection that produce vector titre levels that are
comparable to the levels obtained from stable vector-producing cell lines (Pear et al.,
1993).

Producer cells/packaging cells can be of any suitable cell type. Most commonly, 
mammalian producer cells are used but other cells, such as insect cells, are not excluded.
Clearly, the producer cells will need to be capable of efficiently translating the env and
gag, pol mRNA. Many suitable producer/packaging cell lines are known in the art. The
skilled person is also capable of making suitable packaging cell lines by, for example,
stably introducing a nucleotide construct encoding a packaging component into a cell line.

As will be discussed below, where the retroviral genome encodes an inhibitory RNA 
molecule capable of effecting the cleavage of gag, pol and/or env RNA transcripts, the 
nucleotide sequences present in the packaging cell line, either integrated or carried on 
plasmids, or in the transiently transfected producer cell line, which encode gag, pol and/or
env proteins will be modified so as to reduce or prevent binding of the inhibitory RNA 
molecule(s). In this way, the inhibitory RNA molecule(s) will not prevent expression of 
components in packaging cell lines that are essential for packaging of viral particles. 

It is highly desirable to use high-titre virus preparations in both experimental and practical 
applications. Techniques for increasing viral titre include using a psi plus packaging signal 
as discussed above and concentration of viral stocks. In addition, the use of different
envelope proteins, such as the G protein from vesicular stomatitis virus, has improved titres
following concentration to 10^9 per ml (Cosset et al., 1995). However, typically the
envelope protein will be chosen such that the viral particle will preferentially infect cells
that are infected with the virus which it is desired to treat. For example where an HIV vector
is being used to treat HIV infection, the env protein used will be the HIV env protein. 

Suitable first nucleotide sequences for use according to the present invention encode gene 
products that result in the cleavage and/or enzymatic degradation of a target nucleotide
sequence, which will generally be a ribonucleotide. As particular examples, EGSs,
ribozymes, and antisense sequences may be mentioned, more specifically EGSs. 

External guide sequences (EGSs) are RNA sequences that bind to a complementary target
sequence to form a loop in the target RNA sequence, the overall structure being a substrate
for RNase P-mediated cleavage of the target RNA sequence. The structure that forms when
the EGS anneals to the target RNA is very similar to that found in a tRNA precursor. Thus
the natural activity of RNase P can be directed to cleave a target RNA by designing a
suitable EGS. The general rules for EGS design are as follows, with reference to the
generic EGSs shown in Figure 9B:

Rules for EGS design in mammalian cells (see Figure 9B)

Target sequence - All tRNA precursor molecules have a G immediately 3' of the RNase P
cleavage site (i.e. the G forms a base pair with the C at the top of the acceptor stem prior to
the ACCA sequence). In addition a U is found 8 nucleotides downstream in all tRNAs
(i.e. G at position 1, U at position 8). A pyrimidine may be preferred 5' of the cut site. No
other specific target sequences are required.

EGS sequence - A 7 nucleotide 'acceptor stem' analogue is optimal (5' hybridising arm). 
A 4 nucleotide 'D-stem' analogue is preferred (3' hybridising arm). Variation in this 
length may alter the reaction kinetics. This will be specific to each target site. A consensus 
'T-stem and loop' analogue is essential. Minimal 5' and 3' non-pairing sequences are
preferred to reduce the potential for undesired folding of the EGS RNA. 

Deletion of the 'anti-codon stem and loop' analogue may be beneficial. Deletion of the 
variable loop can also be tolerated in vitro but an optimal replacement loop for the deletion 
of both has not been defined in vivo. 
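
By way of illustration only, the target-sequence rules above can be applied as a simple scan of a candidate RNA. The Python sketch below assumes that "position 1" is the G immediately 3' of the intended cleavage site and that "position 8" is the nucleotide seven places further downstream; the function name `find_egs_target_sites` and the example sequences are hypothetical. The sketch only flags candidate positions; it does not design the EGS itself or model RNase P kinetics.

```python
def find_egs_target_sites(rna):
    """Return (index_of_G, pyrimidine_upstream) for each candidate site.

    A candidate has G at position 1 and U at position 8 (1-based, counted
    from the nucleotide immediately 3' of the intended cleavage site).
    pyrimidine_upstream is True when the nucleotide 5' of the cut is C or U.
    """
    rna = rna.upper().replace("T", "U")   # accept DNA-style input as well
    sites = []
    for i in range(len(rna) - 7):
        if rna[i] == "G" and rna[i + 7] == "U":
            pyrimidine_upstream = i > 0 and rna[i - 1] in "CU"
            sites.append((i, pyrimidine_upstream))
    return sites


# Made-up example: G at index 1, U at index 8, C immediately 5' of the cut.
print(find_egs_target_sites("CGAAAAAAUAG"))
```

Indices are zero-based offsets into the input string, so a reported index `i` means the cut would fall between nucleotides `i - 1` and `i`.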

As with ribozymes, described below, it is preferred to use more than one EGS. Preferably, 
a plurality of EGSs is employed, together capable of cleaving gag, pol and env RNA of the 
native retrovirus at a plurality of sites. Since HIV exists as a population of quasispecies,
not all of the target sequences for the EGSs will be included in all HIV variants. The
problem presented by this variability can be overcome by using multiple EGSs. Multiple
EGSs can be included in series in a single vector and can function independently when 
expressed as a single RNA sequence. A single RNA containing two or more EGSs having 
different target recognition sites may be referred to as a multitarget EGS. 
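
The reasoning behind a multitarget EGS can be illustrated with a small coverage check. The Python sketch below uses made-up variant sequences and made-up target sites, and treats a target as present only when it occurs as an exact substring, which is a simplifying assumption; the function name `sites_present` is hypothetical.

```python
def sites_present(variants, target_sites):
    """Map each variant name to the target sites found in its sequence."""
    return {
        name: [site for site in target_sites if site in seq]
        for name, seq in variants.items()
    }


# Made-up variants: each differs from variant_1 at one nucleotide.
variants = {
    "variant_1": "AAGGCUAAUCCGA",
    "variant_2": "AAGACUAAUCCGA",   # change falls inside the first target
    "variant_3": "AAGGCUAAUACGA",   # change falls inside the second target
}
target_sites = ["GGCUA", "UCCGA"]   # hypothetical sites for two EGSs

hits = sites_present(variants, target_sites)
missed_by_first_alone = [n for n, found in hits.items() if "GGCUA" not in found]
missed_by_both = [n for n, found in hits.items() if not found]

print(missed_by_first_alone, missed_by_both)
```

In this example the first target alone misses one variant, whereas every variant retains at least one of the two targets.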

Further guidance may be obtained by reference to, for example, Werner et al. (1997);
Werner et al. (1998); Ma et al. (1998) and Kawa et al. (1998).

Ribozymes are RNA enzymes which cleave RNA at specific sites. Ribozymes can be 
engineered so as to be specific for any chosen sequence containing a ribozyme cleavage 
site. Thus, ribozymes can be engineered which have chosen recognition sites in transcribed 
viral sequences. By way of an example, ribozymes encoded by the first nucleotide 
sequence recognise and cleave essential elements of viral genomes required for the 
production of viral particles, such as packaging components. Thus, for retroviral genomes, 
such essential elements include the gag, pol and env gene products. A suitable ribozyme 
capable of recognising at least one of the gag, pol and env gene sequences, or more 
typically, the RNA sequences transcribed from these genes, is able to bind to and cleave 
such a sequence. This will reduce or prevent production of the gag, pol or env protein as
appropriate and thus reduce or prevent the production of retroviral particles. 

Ribozymes come in several forms, including hammerhead, hairpin and hepatitis delta 
antigenomic ribozymes. Preferred for use herein are hammerhead ribozymes, in part 
because of their relatively small size, because the sequence requirements for their target 
cleavage site are minimal and because they have been well characterised. The ribozymes 
most commonly used in research at present are hammerhead and hairpin ribozymes. 

Each individual ribozyme has a motif which recognises and binds to a recognition site in 
the target RNA. This motif takes the form of one or more "binding arms", generally two 
binding arms. The binding arms in hammerhead ribozymes are the flanking sequences 
Helix I and Helix III, which flank Helix II. These can be of variable length, usually 
between 6 and 10 nucleotides each, but can be shorter or longer. The length of the flanking 
sequences can affect the rate of cleavage. For example, it has been found that reducing the 
total number of nucleotides in the flanking sequences from 20 to 12 can increase the 
turnover rate of the ribozyme cleaving an HIV sequence by 10-fold (Goodchild et al., 
1991). A catalytic motif in the ribozyme, Helix II in hammerhead ribozymes, cleaves the 
target RNA at a site which is referred to as the cleavage site. Whether or not a ribozyme 
will cleave any given RNA is determined by the presence or absence of a recognition site 
for the ribozyme containing an appropriate cleavage site. 

Each type of ribozyme recognises its own cleavage site. The hammerhead ribozyme 
cleavage site has the nucleotide base triplet GUX directly upstream where G is guanine, U 
is uracil and X is any nucleotide base. Hairpin ribozymes have a cleavage site of 
BCUGNYR, where B is any nucleotide base other than adenine, N is any nucleotide, Y is 
cytosine or thymine and R is guanine or adenine. Cleavage by hairpin ribozymes takes 
place between the G and the N in the cleavage site. 
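
By way of illustration only, candidate hammerhead cleavage sites can be located by scanning an RNA sequence for the GUX triplet described above. The following Python sketch is not part of the described method; the function name find_gux_sites and the short test sequence are hypothetical and chosen purely for the example.

```python
import re


def find_gux_sites(rna):
    """Return (1-based position of G, triplet) for each GUX triplet in an RNA string."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    # A lookahead is used so that overlapping triplets (e.g. GUGUC) are all reported.
    return [(m.start() + 1, m.group(1)) for m in re.finditer(r"(?=(GU[ACGU]))", rna)]


if __name__ == "__main__":
    # Short illustrative RNA, not taken from the sequence listing.
    for position, triplet in find_gux_sites("AAUGUAUAGCC"):
        print(position, triplet)
```

Such a scan only identifies possible cleavage triplets; whether a given site is actually cleaved also depends on the flanking sequences bound by the ribozyme, as discussed above.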

The nucleic acid sequences encoding the packaging components (the "third nucleotide 
sequences") may be resistant to the ribozyme or ribozymes because they lack any cleavage 
sites for the ribozyme or ribozymes. This prohibits enzymatic activity by the ribozyme or 
ribozymes and therefore there is no effective recognition site for the ribozyme or 
ribozymes. Alternatively or additionally, the potential recognition sites may be altered in 
the flanking sequences which form the part of the recognition site to which the ribozyme 
binds. This either eliminates binding of the ribozyme motif to the recognition site, or 
reduces binding capability enough to destabilise any ribozyme-target complex and thus 
reduce the specificity and catalytic activity of the ribozyme. Where the flanking sequences 
only are altered, they are preferably altered such that catalytic activity of the ribozyme at 
the altered target sequence is negligible and is effectively eliminated. 

Preferably, a series of several anti-HIV ribozymes is employed in the invention. These can 
be any anti-HIV ribozymes but must include one or more which cleave the RNA that is 
required for the expression of gag, pol or env. Preferably, a plurality of ribozymes is 
employed, together capable of cleaving gag, pol and env RNA of the native retrovirus at a 
plurality of sites. Since HIV exists as a population of quasispecies, not all of the target 
sequences for the ribozymes will be included in all HIV variants. The problem presented 
by this variability can be overcome by using multiple ribozymes. Multiple ribozymes can 
be included in series in a single vector and can function independently when expressed as a 
single RNA sequence. A single RNA containing two or more ribozymes having different 
target recognition sites may be referred to as a multitarget ribozyme. The placement of 
ribozymes in series has been demonstrated to enhance cleavage. The use of a plurality of 
ribozymes is not limited to treating HIV infection but may be used in relation to other 
viruses, retroviruses or otherwise. 



Antisense technology is well known in the art. There are various mechanisms by which 
antisense sequences are believed to inhibit gene expression. One mechanism by which 
antisense sequences are believed to function is the recruitment of the cellular protein 
RNaseH to the target sequence/antisense construct heteroduplex which results in cleavage 
and degradation of the heteroduplex. Thus the antisense construct, by contrast to 
ribozymes, can be said to lead indirectly to cleavage/degradation of the target sequence. 
Thus according to the present invention, a first nucleotide sequence may encode an 
antisense RNA that binds to either a gene encoding an essential/packaging component or 
the RNA transcribed from said gene such that expression of the gene is inhibited, for 
example as a result of RNaseH degradation of a resulting heteroduplex. It is not necessary 
for the antisense construct to encode the entire complementary sequence of the gene 
encoding an essential/packaging component - a portion may suffice. The skilled person 
will easily be able to determine how to design a suitable antisense construct. 



By contrast, the nucleic acid sequences encoding the essential/packaging components of 
the viral particles required for the assembly of viral particles in the host cells/producer 
cells/packaging cells (the third nucleotide sequences) are resistant to the inhibitory RNA 
molecules encoded by the first nucleotide sequence. For example in the case of ribozymes, 
resistance is typically by virtue of alterations in the sequences which eliminate the 
ribozyme recognition sites. At the same time, the amino acid coding sequence for the 
essential/packaging components is retained so that the viral components encoded by the 
sequences remain the same, or at least sufficiently similar that the function of the 
essential/packaging components is not compromised. 

The term "viral polypeptide required for the assembly of viral particles" means a 
polypeptide normally encoded by the viral genome to be packaged into viral particles, in 
the absence of which the viral genome cannot be packaged. For example, in the context of 
retroviruses such polypeptides would include gag, pol and env. The terms "packaging 
component" and "essential component" are also included within this definition. 

In the case of antisense sequences, the third nucleotide sequence differs from the second 
nucleotide sequence, which encodes the target viral packaging component, to 
the extent that although the antisense sequence can bind to the second nucleotide sequence, 
or a transcript thereof, the antisense sequence cannot bind effectively to the third nucleotide 
sequence or RNA transcribed therefrom. The changes between the second and third 
nucleotide sequences will typically be conservative changes, although a small number of 
amino acid changes may be tolerated provided that, as described above, the function of the 
essential/packaging components is not significantly impaired. 

Preferably, in addition to eliminating the inhibitory RNA recognition sites, the alterations 
to the coding sequences for the viral components improve the sequences for codon usage in 
the mammalian cells or other cells which are to act as the producer cells for retroviral 
vector particle production. This improvement in codon usage is referred to as "codon 
optimisation". Many viruses, including HIV and other lentiviruses, use a large number of 
rare codons and by changing these to correspond to commonly used mammalian codons, 
increased expression of the packaging components in mammalian producer cells can be 
achieved. Codon usage tables are known in the art for mammalian cells, as well as for a 
variety of other organisms. 
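
The effect of codon optimisation on an inhibitory RNA recognition site may be illustrated by the following Python sketch. It is a minimal example under stated assumptions and is not the procedure actually used to design the synthetic sequences: the PREFERRED table is a hand-picked, hypothetical set of one frequently used mammalian codon per amino acid rather than a measured codon usage table, and the function names are illustrative. The wild type fragment is a 30 nucleotide in-frame stretch of the HXB2 gag sequence of SEQ. ID. NO. 1, and GAG1_SITE is the DNA form of the GAG 1 ribozyme target site given in Reference Example 1.

```python
from itertools import product

# Standard genetic code, built in TCAG order ('*' denotes a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: illustrative choice of one commonly used mammalian codon per
# amino acid; a real design would draw on a published codon usage table.
PREFERRED = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC", "G": "GGC",
    "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG", "M": "ATG", "N": "AAC",
    "P": "CCC", "Q": "CAG", "R": "CGC", "S": "AGC", "T": "ACC", "V": "GTG",
    "W": "TGG", "Y": "TAC", "*": "TGA",
}


def translate(dna):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def codon_optimise(dna):
    """Replace every codon by the preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


WILD_TYPE = "AATAAAATAGTAAGAATGTATAGCCCTACC"  # HXB2 gag fragment (SEQ. ID. NO. 1)
GAG1_SITE = "TAGTAAGAATGTATAGCCCTAC"          # GAG 1 target site, written as DNA

optimised = codon_optimise(WILD_TYPE)

if __name__ == "__main__":
    print("protein unchanged:", translate(optimised) == translate(WILD_TYPE))
    print("site in wild type:", GAG1_SITE in WILD_TYPE)
    print("site in optimised:", GAG1_SITE in optimised)
```

In this sketch the encoded amino acid sequence is preserved while the recognition site is no longer present in the recoded fragment, which is the property relied upon for resistance of the packaging components.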

Thus preferably, the sequences encoding the packaging components are codon optimised. 
More preferably, the sequences are codon optimised in their entirety. Following codon 
optimisation, it is found that there are numerous sites in the wild type gag, pol and env 
sequences which can serve as inhibitory RNA recognition sites and which are no longer 
present in the sequences encoding the packaging components. In an alternative but less 
practical strategy, the sequences encoding the packaging components can be altered by 
targeted conservative alterations so as to render them resistant to selected inhibitory RNAs 
capable of effecting the cleavage of the wild type sequences. 

An additional advantage of codon optimising HIV packaging components is that this can 
increase gene expression. In particular, it can render gag, pol expression Rev independent 
so that rev and RRE need not be included in the genome (Haas et al., 1996). 
Rev-independent vectors are therefore possible. This in turn enables the use of anti-rev or RRE 
factors in the retroviral vector. 

As described above, the packaging components for a retroviral vector include expression 
products of gag, pol and env genes. In accordance with the present invention, gag and pol 
employed in the packaging system are derived from the target retrovirus on which the 
vector genome is based. Thus, in the RNA transcript form, gag and pol would normally be 
cleavable by the ribozymes present in the vector genome. The env gene employed in the 
packaging system may be derived from a different virus, including other retroviruses such 
as MLV and non-retroviruses such as VSV (a Rhabdovirus), in which case it may not need 
any sequence alteration to render it resistant to cleavage effected by the inhibitory RNA(s). 
Alternatively, env may be derived from the same retrovirus as gag and pol, in which case 
any recognition sites for the inhibitory RNA(s) will need to be eliminated by sequence 
alteration. 

The process of producing a retroviral vector in which the envelope protein is not the native 
envelope of the retrovirus is known as "pseudotyping". Certain envelope proteins, such as 
MLV envelope protein and vesicular stomatitis virus G (VSV-G) protein, pseudotype 
retroviruses very well. Pseudotyping can be useful for altering the target cell range of the 
retrovirus. Alternatively, to maintain target cell specificity for target cells infected with the 
particular virus it is desired to treat, the envelope protein may be the same as that of the 
target virus, for example HIV. 

Other therapeutic coding sequences may be present along with the first nucleotide 
sequence or sequences. Other therapeutic coding sequences include, but are not limited to, 
sequences encoding cytokines, hormones, antibodies, immunoglobulin fusion proteins, 
enzymes, immune co-stimulatory molecules, anti-sense RNA, a transdominant negative 
mutant of a target protein, a toxin, a conditional toxin, an antigen, a single chain antibody, 
tumour suppressor protein and growth factors. When included, such coding sequences are 
operatively linked to a suitable promoter, which may be the promoter driving expression of 
the first nucleotide sequence or a different promoter or promoters. 

Thus the invention comprises two components. The first is a genome construction that will 
be packaged by viral packaging components and which carries a series of anti-viral 
inhibitory RNA molecules such as anti-HIV EGSs. These could be any anti-HIV EGSs but 
the key issue for this invention is that some of them result in cleavage of RNA that is 
required for the expression of native or wild type HIV gag, pol or env coding sequences. 
The second component is the packaging system which comprises a cassette for the 
expression of HIV gag, pol and a cassette either for HIV env or an envelope gene encoding 
a pseudotyping envelope protein - the packaging system being resistant to the inhibitory 
RNA molecules. 

The viral particles of the present invention, and the viral vector system and methods used 
to produce them, may thus be used to treat or prevent viral infections, preferably retroviral 
infections, in particular lentiviral, especially HIV, infections. Specifically, the viral 
particles of the invention, typically produced using the viral vector system of the present 
invention may be used to deliver inhibitory RNA molecules to a human or animal in need 
of treatment for a viral infection. 

Alternatively, or in addition, the viral production system may be used to transfect, ex vivo, 
cells obtained from a patient, which are then returned to the patient. Patient cells transfected ex 
vivo may be formulated as a pharmaceutical composition (see below) prior to 
readministration to the patient. 

Preferably the viral particles are combined with a pharmaceutically acceptable carrier or 
diluent to produce a pharmaceutical composition. Thus, the present invention also provides 
a pharmaceutical composition for treating an individual, wherein the composition 
comprises a therapeutically effective amount of the viral particle of the present invention, 
together with a pharmaceutically acceptable carrier, diluent, excipient or adjuvant. The 
pharmaceutical composition may be for human or animal usage. 

The choice of pharmaceutical carrier, excipient or diluent can be made with regard to the 
intended route of administration and standard pharmaceutical practice. Suitable carriers and 
diluents include isotonic saline solutions, for example phosphate-buffered saline. The 
pharmaceutical compositions may comprise as - or in addition to - the carrier, excipient or 
diluent any suitable binder(s), lubricant(s), suspending agent(s), coating agent(s), 
solubilising agent(s), and other carrier agents that may aid or increase the viral entry into 
the target site (such as for example a lipid delivery system). 

The pharmaceutical composition may be formulated for parenteral, intramuscular, 
intravenous, intracranial, subcutaneous, intraocular or transdermal administration. 

Where appropriate, the pharmaceutical compositions can be administered by any one or 
more of: inhalation, in the form of a suppository or pessary, topically in the form of a 
lotion, solution, cream, ointment or dusting powder, by use of a skin patch, orally in the 
form of tablets containing excipients such as starch or lactose, or in capsules or ovules 
either alone or in admixture with excipients, or in the form of elixirs, solutions or 
suspensions containing flavouring or colouring agents, or they can be injected parenterally, 
for example intracavernosally, intravenously, intramuscularly or subcutaneously. For 
parenteral administration, the compositions may be best used in the form of a sterile 
aqueous solution which may contain other substances, for example enough salts or 
monosaccharides to make the solution isotonic with blood. For buccal or sublingual 
administration the compositions may be administered in the form of tablets or lozenges 
which can be formulated in a conventional manner. 

The amount of virus administered is typically in the range of from 10^3 to 10^10 pfu, 
preferably from 10^5 to 10^8 pfu, more preferably from 10^6 to 10^7 pfu. When injected, 
typically 1-10 µl of virus in a pharmaceutically acceptable suitable carrier or diluent is 
administered. 

When the polynucleotide/vector is administered as a naked nucleic acid, the amount of 
nucleic acid administered is typically in the range of from 1 ng to 10 mg, preferably from 
100 µg to 1 mg. 

Where the first nucleotide sequence (or other therapeutic sequence) is under the control of 
an inducible regulatory sequence, it may only be necessary to induce gene expression for 
the duration of the treatment. Once the condition has been treated, the inducer is removed 
and expression of the NOI is stopped. This will clearly have clinical advantages. Such a 
system may, for example, involve administering the antibiotic tetracycline to activate gene 
expression via its effect on the tet repressor/VP16 fusion protein. 

The invention will now be further described by way of Examples, which are meant to serve 
to assist one of ordinary skill in the art in carrying out the invention and are not intended in 
any way to limit the scope of the invention. The Examples refer to the Figures. In the 
Figures: 

Figure 1 shows schematically ribozymes inserted into four different HIV vectors; 

Figure 2 shows schematically how to create a suitable 3' LTR by PCR; 

Figure 3 shows the codon usage table for wild type HIV gag, pol of strain HXB2 (accession 
number: K03455). 

Figure 4 shows the codon usage table of the codon optimised sequence designated 
gag, pol-SYNgp. 

Figure 5 shows the codon usage table of the wild type HIV env called env-mn. 

Figure 6 shows the codon usage table of the codon optimised sequence of HIV env 
designated SYNgp160mn. 

Figure 7 shows three plasmid constructs for use in the invention. 

Figure 8 shows the principle behind two systems for producing retroviral vector particles. 

Figure 9A shows an EGS based on tyrosyl tRNA. 

Figure 9B shows a consensus EGS sequence. 

Figure 10 shows twelve different anti-HIV EGS constructs. 

Figure 11 is a schematic representation of pDozenEgs and construction of pH4DozenEgs. 

The invention will now be further described in the Examples which follow, which are 
intended as an illustration only and do not limit the scope of the invention. 

EXAMPLES 

Reference Example 1 - Construction of a Ribozyme-encoding Genome 

The HIV gag, pol sequence was codon optimised (Figure 4 and SEQ I.D. No. 1) and 
synthesised using overlapping oligos of around 40 nucleotides. This has three advantages. 
Firstly it allows an HIV based vector to carry ribozymes and other therapeutic factors. 
Secondly the codon optimisation generates a higher vector titre due to a higher level of 
gene expression. Thirdly gag, pol expression becomes rev independent which allows the 
use of anti-rev or RRE factors. 

Conserved sequences within gag, pol were identified by reference to the HIV Sequence 
database at Los Alamos National Laboratory (http://hiv-web.lanl.gov/) and used to design 
ribozymes. Because of the variability between subtypes of HIV-1 the ribozymes were 
designed to cleave the predominant subtype within North America, Latin America and the 
Caribbean, Europe, Japan and Australia; that is subtype B. The sites chosen were 
cross-referenced with the synthetic gagpol sequence to ensure that there was a low possibility of 
cutting the codon optimised gagpol mRNA. The ribozymes were designed with XhoI and 
SalI sites at the 5' and 3' end respectively. This allows the construction of separate and 
tandem ribozymes. 

The ribozymes are hammerhead ribozymes (Riddell et al., 1996) of the following general 
structure: 

   Helix I  Helix II               Helix III 
5'-NNNNNNNN-CUGAUGAGGCCGAAAGGCCGAA-NNNNNNNN- 

The catalytic domain of the ribozyme (Helix II) can tolerate some changes without 
reducing catalytic turnover. 

The cleavage sites, targeting gag and pol, with the essential GUX triplet (where X is any 
15 nucleotide base) are as follows: 



GAG 1   5' UAGUAAGAAUGUAUAGCCCUAC 
GAG 2   5' AACCCAGAUUGUAAGACUAUUU 
GAG 3   5' UGUUUCAAUUGUGGCAAAGAAG 
GAG 4   5' AAAAAGGGCUGUUGGAAAUGUG 
POL 1   5' ACGACCCCUCGUCACAAUAAAG 
POL 2   5' GGAAUUGGAGGUUUUAUCAAAG 
POL 3   5' AUAUUUUUCAGUUCCCUUAGAU 
POL 4   5' UGGAUGAUUUGUAUGUAGGAUC 
POL 5   5' CUUUGGAUGGGUUAUGAACUCC 
POL 6   5' CAGCUGGACUGUCAAUGACAUA 
POL 7   5' AACUUUCUAUGUAGAUGGGGCA 
POL 8   5' AAGGCCGCCUGUUGGUGGGCAG 
POL 9   5' UAAGACAGCAGUACAAAUGGCA 
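
As a simple consistency check on the cleavage sites listed above, the following Python sketch confirms that each 22 nucleotide site carries the essential GUX triplet. It is offered as an illustration only. The observation that the triplet occupies positions 11 to 13 of each site is drawn from the sites as listed here and is not a stated design rule; the names SITES, gux_triplet and has_central_gux are hypothetical.

```python
# Cleavage sites targeting gag and pol, written 5' to 3' as listed above.
SITES = {
    "GAG 1": "UAGUAAGAAUGUAUAGCCCUAC",
    "GAG 2": "AACCCAGAUUGUAAGACUAUUU",
    "GAG 3": "UGUUUCAAUUGUGGCAAAGAAG",
    "GAG 4": "AAAAAGGGCUGUUGGAAAUGUG",
    "POL 1": "ACGACCCCUCGUCACAAUAAAG",
    "POL 2": "GGAAUUGGAGGUUUUAUCAAAG",
    "POL 3": "AUAUUUUUCAGUUCCCUUAGAU",
    "POL 4": "UGGAUGAUUUGUAUGUAGGAUC",
    "POL 5": "CUUUGGAUGGGUUAUGAACUCC",
    "POL 6": "CAGCUGGACUGUCAAUGACAUA",
    "POL 7": "AACUUUCUAUGUAGAUGGGGCA",
    "POL 8": "AAGGCCGCCUGUUGGUGGGCAG",
    "POL 9": "UAAGACAGCAGUACAAAUGGCA",
}


def gux_triplet(site):
    """Return the triplet at positions 11-13 (1-based) of a listed site."""
    return site[10:13]


def has_central_gux(site):
    """True if the site is 22 nt long and positions 11-12 read GU."""
    return len(site) == 22 and gux_triplet(site).startswith("GU")


if __name__ == "__main__":
    for name, site in SITES.items():
        print(name, gux_triplet(site), has_central_gux(site))
```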

The ribozymes are inserted into four different HIV vectors (pH4 (Gervaix et al., 1997), 
pH6, pH4.1, or pH6.1) (Figure 1). In pH4 and pH6, transcription of the ribozymes is 
driven by an internal HCMV promoter (Foecking et al., 1986). From pH4.1 and pH6.1, the 
ribozymes are expressed from the 5' LTR. The major difference between pH4 and pH6 
(and pH4.1 and pH6.1) resides in the 3' LTR in the production plasmid. pH4 and pH4.1 
have the HIV U3 in the 3' LTR. pH6 and pH6.1 have HCMV in the 3' LTR. The HCMV 
promoter replaces most of the U3 and will drive expression at high constitutive levels 
while the HIV-1 U3 will support a high level of expression only in the presence of Tat. 

The HCMV/HIV-1 hybrid 3' LTR is created by recombinant PCR with three PCR primers 
(Figure 2). The first round of PCR is performed with RIB1 and RIB2 using pH4 (Kim et 
al., 1998) as the template to amplify the HIV-1 HXB2 sequence 8900-9123. The second 
round of PCR makes the junction between the 5' end of the HIV-1 U3 and the HCMV 
promoter by amplifying the hybrid 5' LTR from pH4. The PCR product from the first PCR 
reaction and RIB3 serve as the 5' primer and 3' primer respectively. 

RIB1: 5'-CAGCTGCTCGAGCAGCTGAAGCTTGCATGC-3' 

RIB2: 5'-GTAAGTTATGTAACGGACGATATCTTGTCTTCTT-3' 

RIB3: 5'-CGCATAGTCGACGGGCCCGCCACTGCTAGAGATTTTC-3' 

The PCR product is then cut with SphI and SalI and inserted into pH4 thereby replacing the 
3' LTR. The resulting plasmid is designated pH6. To construct pH4.1 and pH6.1, the 
internal HCMV promoter (SpeI-XhoI) in pH4 and pH6 is replaced with the polycloning 
site of pBluescript II KS+ (Stratagene) (SpeI-XhoI). 
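
The restriction sites carried by the PCR primers can be confirmed by a simple text search, as sketched below in Python. This is an illustration only and forms no part of the cloning procedure. The recognition sequences used are the standard ones for XhoI (CTCGAG), SalI (GTCGAC) and SphI (GCATGC); the function name sites_in is hypothetical. Because all three recognition sequences are palindromic, searching one strand is sufficient.

```python
# Standard recognition sequences of the enzymes named in the text.
RECOGNITION_SITES = {"XhoI": "CTCGAG", "SalI": "GTCGAC", "SphI": "GCATGC"}

# PCR primers as given above, written 5' to 3'.
PRIMERS = {
    "RIB1": "CAGCTGCTCGAGCAGCTGAAGCTTGCATGC",
    "RIB2": "GTAAGTTATGTAACGGACGATATCTTGTCTTCTT",
    "RIB3": "CGCATAGTCGACGGGCCCGCCACTGCTAGAGATTTTC",
}


def sites_in(sequence):
    """Return the sorted names of the enzymes whose site occurs in the sequence."""
    sequence = sequence.upper()
    return sorted(name for name, site in RECOGNITION_SITES.items() if site in sequence)


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, sites_in(primer))
```

On the primers as listed, RIB1 carries the SphI site and RIB3 the SalI site, consistent with the SphI and SalI digest of the PCR product described above.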

The ribozymes are inserted into the XhoI sites in the genome vector backbones. Any 
ribozymes in any configuration could be used in a similar way. 

Reference Example 2 - Construction of a Packaging System 

The packaging system can take various forms. In a first form of packaging system, the 
HIV gag, pol components are co-expressed with the HIV env coding sequence. In this 
case, both the gag, pol and the env coding sequences are altered such that they are resistant 
to the anti-HIV ribozymes that are built into the genome. At the same time as altering the 
codon usage to achieve resistance, the codons can be chosen to match the usage pattern of 
the most highly expressed mammalian genes. This dramatically increases expression 
levels and so increases titre. A codon optimised HIV env coding sequence has been 
described by Haas et al. (1996). In the present example, a modified codon optimised HIV 
env sequence is used (SEQ I.D. No. 3). The corresponding env expression plasmid is 
designated pSYNgp160mn. The modified sequence contains extra motifs not used by Haas 
et al. The extra sequences were taken from the HIV env sequence of strain MN and codon 
optimised. Any similar modification of the nucleic acid sequence would function similarly 
as long as it used codons corresponding to abundant tRNAs (Zolotukhin et al., 1996) and 
led to resistance to the ribozymes in the genome. 

In one example of a gag, pol coding sequence with optimised codon usage, overlapping 
oligonucleotides are synthesised and then ligated together to produce the synthetic coding 
sequence. The sequence of a wild-type (Genbank accession no. K03455) and synthetic 
(gagpol-SYNgp) gagpol sequence is shown in SEQ I.D. Nos 1 and 2, respectively and their 
codon usage is shown in Figures 3 and 4, respectively. The sequence of a wild type env 
coding sequence (Genbank Accession No. M17449) is given in SEQ I.D. No 3, the 
sequence of a synthetic codon optimised sequence is given in SEQ. I.D. No. 4 and their 
codon usage tables are given in Figures 5 and 6, respectively. As with the env coding 
sequence any gag, pol sequence that achieves resistance to the ribozymes could be used. 
The synthetic sequence shown is designated gag, pol-SYNgp and has an EcoRI site at the 5' 
end and a NotI site at the 3' end. It is inserted into pCIneo (Promega) to produce plasmid 
pSYNgp. 

In a second form of the packaging system a synthetic gag, pol cassette is coexpressed with 
a non-HIV envelope coding sequence that produces a surface protein that pseudotypes 
HIV. This could be for example VSV-G (Ory et al., 1996; Zhu et al., 1990), amphotropic 
MLV env (Chesebro et al., 1990; Spector et al., 1990) or any other protein that would be 
incorporated into the HIV particle (Valsesia-Wittman, 1994). This includes molecules 
capable of targeting the vector to specific tissues. Coding sequences for non-HIV envelope 
proteins are not cleaved by the ribozymes and so no sequence modification is required 
(although some sequence modification may be desirable for other reasons such as 
optimisation for codon usage in mammalian cells). 



Reference Example 3 - Vector Particle Production 

Vector particles can be produced either from a transient three-plasmid transfection system 
similar to that described by Soneoka et al. (1995) or from producer cell lines similar to 
those used for other retroviral vectors (Ory et al., 1996; Srinivasakumar et al., 1997; Yu et 
al., 1996). These principles are illustrated in Figures 7 and 8. For example, by using 
pH6Rz, pSYNgp and pRV67 (VSV-G expression plasmid) in a three plasmid transfection 
of 293T cells (Figure 8), as described by Soneoka et al. (1995), vector particles designated 
H6Rz-VSV are produced. These transduce the H6Rz genome to CD4+ cells such as 
C1866 or Jurkat and produce the multitarget ribozymes. HIV replication in these cells is 
now severely restricted. 

Example 1 - Use of external guide sequences for inhibiting HIV 

Ribonuclease P is a nuclear localised enzyme consisting of protein and RNA subunits. It 
has been found in all organisms examined and is one of the most abundant, stable and 
efficient enzymes in cells. Its enzymatic activity is responsible for the maturation of the 5' 
termini of all tRNAs which account for about 2% of the total cellular RNA. 

For tRNA processing, it has been shown that RNase P recognises a secondary structure of 
the tRNA. However, extensive studies have shown that any complex of two RNA 
molecules which resembles a single tRNA molecule will also be recognised and cleaved by 
RNase P. Consequently the natural activity of RNase P can be, and has been, successfully 
redirected to target other RNA species (see Yuan and Altman, 1994, and references therein). 
This is achieved by engineering a sequence, containing the flanking motif recognised by 
RNase P, to bind the desired target sequence. These sequences are called external guide 
sequences (EGSs). 

Outlined here is a strategy employing the EGS system against HIV RNA. Shown in Figure 
2 A, B and C are twelve EGS sequences designed to target twelve separate HIV gag/pol 
sequences. These target sequences are conserved throughout the clade B of HIV. The 
sequence numbering in each figure designates the position of the required conserved G of 
each target sequence based on the HXB2 published sequence. 

The external guide sequences shown here all have anticodon stem-loops deleted. These are 
non-limiting examples; for instance full length 3/4 tRNA based EGSs might be used if 
preferred (see Yuan and Altman, 1994). 



Outlined in SEQ ID. Nos. 5 to 10 (see below) and Figure 11 is the cloning strategy 
employed to construct an HIV vector containing the EGSs described in SEQ ID. Nos. 5 to 
10. The oligonucleotides prefixed 1, 2, 3, 4, 5 and 6 are respectively annealed together and 
sequentially cloned into the pSP72 (Promega) cloning vector starting with the oligo duplex 
1/1A being cloned into the XhoI-SalI site such that the EGS 4762 and EGS 4715 are 
orientated away from the ampicillin gene. The remaining oligonucleotides (with XhoI ends) 
are subsequently cloned stepwise (starting with oligo duplex 2/2A, ending with duplex 
6/6A) into the unique SalI site (present within the terminus of each preceding 
oligonucleotide) to create the plasmid pDOZENEGS. The EGSs from this vector are then 
transferred by XhoI/SphI digest into pH4Z, similarly cut, such that the multiple EGS 
cassette replaces the lacZ gene of pH4Z (Kim et al., 1998). The resulting vector is named 
pH4DOZENEGS (see SEQ ID. No. 11 for complete sequence). 



Egs 1/1A (SEQ ID. No. 5) 

   XhoI 
5'-tcgagcccggggatgacgtcatcgacttcgaaggttcgaatccttctactgccaccatttttt 
       cgggcccctactgcagtagctgaagcttccaagcttaggaagatgacggtggtaaaaaa 

   ctctacgtcatcgacttcgaaggttcgaatccttccctgtccaccagtcgacc-3' 
   gagatgcagtagctgaagcttccaagcttaggaagggacaggtggtcagctggagct-5' 

Egs 2/2A (SEQ ID. No. 6) 

5'-tcgagtattacgtcatcgacttcgaaggttcgaatccttctagattcaccattttttaggaacg 
       cataatgcagtagctgaagcttccaagcttaggaagtactaagtggtaaaaaatccttgc 

   tcatcgacttcgaaggttcgaatccttccagttccaccagtcgacc-3' 
   agtagctgaagcttccaagcttaggaaggtcaaggtggtcagctggagct-5' 

Egs 3/3A (SEQ ID. No. 7) 

5'-tcgaggccaacgtcatcgacttcgaaggttcgaatccttctcttcccaccattttttttcc 
       ccggttgcagtagctgaagcttccaagcttaggaagagaagggtggtaaaaaaaagg 

   acgtcatcgacttcgaaggttcgaatccttcggggcccaccagtcgacc-3' 
   tgcagtagctgaagcttccaagcttaggaagccccgggtggtcagctggagct-5' 

Egs 4/4A (SEQ ID. No. 8) 

5'-tcgagggctacgtcatcgacttcgaaggttcgaatccttcttgcttcaccatttttt 
       cccgatgcagtagctgaagcttccaagcttaggaagaacgaagtggtaaaaaa 

   ctgaacgtcatcgacttcgaaggttcgaatccttctgctgtcaccagtcgacc-3' 
   gacttgcagtagctgaagcttccaagcttaggaagacgacagtggtcagctggagct-5' 

Egs 5/5A (SEQ ID. No. 9) 

5'-tcgagtataacgtcatcgacttcgaaggttcgaatccttcaccggtcaccatttttttata 
       catattgcagtagctgaagcttccaagcttaggaagtggccagtggtaaaaaaatat 

   acgtcatcgacttcgaaggttcgaatccttcttcttacaccagtcgacc-3' 
   tgcagtagctgaagcttccaagcttaggaagaagaatgtggtcagctggagct-5' 

Egs 6/6A (SEQ ID. No. 10) 

5'-tcgaggtacacgtcatcgacttcgaaggttcgaatccttcgtagttcaccattttttgtgc 
       ccatgtgcagtagctgaagcttccaagcttaggaagcatcaagtggtaaaaaacacg 

                                                   SphI 
   acgtcatcgacttcgaaggttcgaatccttctaggcccaccagtcgacgcatgcc-3' 
   tgcagtagctgaagcttccaagcttaggaagatccgggtggtcagctgcgtacggagct-5' 
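
When transcribing oligonucleotide duplexes such as those above, it can be useful to confirm that the lower strand is the base-by-base complement of the upper strand. The following Python sketch illustrates one way of doing so for the first line of the Egs 1/1A duplex. It is a minimal example only; the function name is_paired is hypothetical, and the sketch assumes that the upper strand is written 5' to 3' with a four nucleotide 5' overhang (tcga) and that the lower strand is written 3' to 5' beneath it.

```python
# Map each base to its Watson-Crick partner (lower case, as in the listing).
COMPLEMENT = str.maketrans("acgt", "tgca")


def is_paired(top, bottom, overhang=4):
    """Check that `bottom` (3'->5') complements `top` (5'->3') after the overhang."""
    paired_top = top[overhang:overhang + len(bottom)]
    return len(paired_top) == len(bottom) and paired_top.translate(COMPLEMENT) == bottom


# First line of the Egs 1/1A duplex (SEQ ID. No. 5).
TOP = "tcgagcccggggatgacgtcatcgacttcgaaggttcgaatccttctactgccaccatttttt"
BOTTOM = "cgggcccctactgcagtagctgaagcttccaagcttaggaagatgacggtggtaaaaaa"

if __name__ == "__main__":
    print(is_paired(TOP, BOTTOM))
```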



The pH4DOZENEGS vector may be used both to deliver and to express the example EGS 
sequences in appropriate eukaryotic cells in a manner as described for ribozymes in 
Reference Examples 1, 2 and 3, whereby the use of codon optimised gag/pol and env genes 
would prevent EGSs from targeting these genes during viral production. The inclusion of 
the EGS sequences into an HIV derived vector will not only allow expression of such 
sequences in the target cell but also packaging and transfer of such therapeutic sequences 
by the patient's own HIV. These example EGS sequences target HIV RNA for cleavage 
by RNase P. This example is not limiting and other suitable EGS and derived sequences 
may also be used; be they expressed singly, in multiples, from pol I, pol II or pol III 
promoters and derivatives thereof and/or in combination with other HIV treatments. Other 
appropriate nucleotide sequences of interest (NOIs) may also be included in combination 
with EGSs if preferred. 

All publications mentioned in the above specification are herein incorporated by reference. 
Various modifications and variations of the described methods and system of the invention 
will be apparent to those skilled in the art without departing from the scope and spirit of the 
invention. Although the invention has been described in connection with specific preferred 
embodiments, it should be understood that the invention as claimed should not be unduly 
limited to such specific embodiments. Indeed, various modifications of the described 
modes for carrying out the invention which are obvious to those skilled in molecular 
biology or related fields are intended to be within the scope of the following claims. 

SEQUENCE LISTING PART OF THE DESCRIPTION 



SEQ. ID. NO. 1 - Wild type gagpol sequence for strain HXB2 (accession no. K03455) 

ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAGAATTAG ATCGATGGGA AAAAATTCGG 60 
TTAAGGCCAG GGGGAAAGAA AAAATATAAA TTAAAACATA TAGTATGGGC AAGCAGGGAG 120 
CTAGAACGAT TCGCAGTTAA TCCTGGCCTG TTAGAAACAT CAGAAGGCTG TAGACAAATA 180 
CTGGGACAGC TACAACCATC CCTTCAGACA GGATCAGAAG AACTTAGATC ATTATATAAT 240 
ACAGTAGCAA CCCTCTATTG TGTGCATCAA AGGATAGAGA TAAAAGACAC CAAGGAAGCT 300 
TTAGACAAGA TAGAGGAAGA GCAAAACAAA AGTAAGAAAA AAGCACAGCA AGCAGCAGCT 360 
GACACAGGAC ACAGCAATCA GGTCAGCCAA AATTACCCTA TAGTGCAGAA CATCCAGGGG 420 
CAAATGGTAC ATCAGGCCAT ATCACCTAGA ACTTTAAATG CATGGGTAAA AGTAGTAGAA 480 
GAGAAGGCTT TCAGCCCAGA AGTGATACCC ATGTTTTCAG CATTATCAGA AGGAGCCACC 540 
CCACAAGATT TAAACACCAT GCTAAACACA GTGGGGGGAC ATCAAGCAGC CATGCAAATG 600 
TTAAAAGAGA CCATCAATGA GGAAGCTGCA GAATGGGATA GAGTGCATCC AGTGCATGCA 660 
GGGCCTATTG CACCAGGCCA GATGAGAGAA CCAAGGGGAA GTGACATAGC AGGAACTACT 720 
AGTACCCTTC AGGAACAAAT AGGATGGATG ACAAATAATC CACCTATCCC AGTAGGAGAA 780 
ATTTATAAAA GATGGATAAT CCTGGGATTA AATAAAATAG TAAGAATGTA TAGCCCTACC 840 
AGCATTCTGG ACATAAGACA AGGACCAAAG GAACCCTTTA GAGACTATGT AGACCGGTTC 900 
TATAAAACTC TAAGAGCCGA GCAAGCTTCA CAGGAGGTAA AAAATTGGAT GACAGAAACC 960 
TTGTTGGTCC AAAATGCGAA CCCAGATTGT AAGACTATTT TAAAAGCATT GGGACCAGCG 1020 
GCTACACTAG AAGAAATGAT GACAGCATGT CAGGGAGTAG GAGGACCCGG CCATAAGGCA 1080 
AGAGTTTTGG CTGAAGCAAT GAGCCAAGTA ACAAATTCAG CTACCATAAT GATGCAGAGA 1140 
GGCAATTTTA GGAACCAAAG AAAGATTGTT AAGTGTTTCA ATTGTGGCAA AGAAGGGCAC 1200 
ACAGCCAGAA ATTGCAGGGC CCCTAGGAAA AAGGGCTGTT GGAAATGTGG AAAGGAAGGA 1260 
CACCAAATGA AAGATTGTAC TGAGAGACAG GCTAATTTTT TAGGGAAGAT CTGGCCTTCC 1320 
TACAAGGGAA GGCCAGGGAA TTTTCTTCAG AGCAGACCAG AGCCAACAGC CCCACCAGAA 1380 
GAGAGCTTCA GGTCTGGGGT AGAGACAACA ACTCCCCCTC AGAAGCAGGA GCCGATAGAC 1440 
AAGGAACTGT ATCCTTTAAC TTCCCTCAGG TCACTCTTTG GCAACGACCC CTCGTCACAA 1500 
TAAAGATAGG GGGGCAACTA AAGGAAGCTC TATTAGATAC AGGAGCAGAT GATACAGTAT 1560 
TAGAAGAAAT GAGTTTGCCA GGAAGATGGA AACCAAAAAT GATAGGGGGA ATTGGAGGTT 1620 
TTATCAAAGT AAGACAGTAT GATCAGATAC TCATAGAAAT CTGTGGACAT AAAGCTATAG 1680 
GTACAGTATT AGTAGGACCT ACACCTGTCA ACATAATTGG AAGAAATCTG TTGACTCAGA 1740 
TTGGTTGCAC TTTAAATTTT CCCATTAGCC CTATTGAGAC TGTACCAGTA AAATTAAAGC 1800 
CAGGAATGGA TGGCCCAAAA GTTAAACAAT GGCCATTGAC AGAAGAAAAA ATAAAAGCAT 1860 
TAGTAGAAAT TTGTACAGAG ATGGAAAAGG AAGGGAAAAT TTCAAAAATT GGGCCTGAAA 1920 
ATCCATACAA TACTCCAGTA TTTGCCATAA AGAAAAAAGA CAGTACTAAA TGGAGAAAAT 1980 
TAGTAGATTT CAGAGAACTT AATAAGAGAA CTCAAGACTT CTGGGAAGTT CAATTAGGAA 2040 
TACCACATCC CGCAGGGTTA AAAAAGAAAA AATCAGTAAC AGTACTGGAT GTGGGTGATG 2100 
CATATTTTTC AGTTCCCTTA GATGAAGACT TCAGGAAGTA TACTGCATTT ACCATACCTA 2160 
GTATAAACAA TGAGACACCA GGGATTAGAT ATCAGTACAA TGTGCTTCCA CAGGGATGGA 2220 
AAGGATCACC AGCAATATTC CAAAGTAGCA TGACAAAAAT CTTAGAGCCT TTTAGAAAAC 2280 
AAAATCCAGA CATAGTTATC TATGAATACA TGGATGATTT GTATGTAGGA TCTGACTTAG 2340 
AAATAGGGCA GCATAGAACA AAAATAGAGG AGCTGAGACA ACATCTGTTG AGGTGGGGAC 2400 
TTACCACACC AGACAAAAAA CATCAGAAAG AACCTCCATT CCTTTGGATG GGTTATGAAC 2460 
TCCATCCTGA TAAATGGACA GTACAGCCTA TAGTGCTGCC AGAAAAAGAC AGCTGGACTG 2520 
TCAATGACAT ACAGAAGTTA GTGGGGAAAT TGAATTGGGC AAGTCAGATT TACCCAGGGA 2580 
TTAAAGTAAG GCAATTATGT AAACTCCTTA GAGGAACCAA AGCACTAACA GAAGTAATAC 2640 
CACTAACAGA AGAAGCAGAG CTAGAACTGG CAGAAAACAG AGAGATTCTA AAAGAACCAG 2700 
TACATGGAGT GTATTATGAC CCATCAAAAG ACTTAATAGC AGAAATACAG AAGCAGGGGC 2760 
AAGGCCAATG GACATATCAA ATTTATCAAG AGCCATTTAA AAATCTGAAA ACAGGAAAAT 2820 
ATGCAAGAAT GAGGGGTGCC CACACTAATG ATGTAAAACA ATTAACAGAG GCAGTGCAAA 2880 
AAATAACCAC AGAAAGCATA GTAATATGGG GAAAGACTCC TAAATTTAAA CTGCCCATAC 2940 
AAAAGGAAAC ATGGGAAACA TGGTGGACAG AGTATTGGCA AGCCACCTGG ATTCCTGAGT 3000 
GGGAGTTTGT TAATACCCCT CCCTTAGTGA AATTATGGTA CCAGTTAGAG AAAGAACCCA 3060 
TAGTAGGAGC AGAAACCTTC TATGTAGATG GGGCAGCTAA CAGGGAGACT AAATTAGGAA 3120 
AAGCAGGATA TGTTACTAAT AGAGGAAGAC AAAAAGTTGT CACCCTAACT GACACAACAA 3180 
ATCAGAAGAC TGAGTTACAA GCAATTTATC TAGCTTTGCA GGATTCGGGA TTAGAAGTAA 3240 
ACATAGTAAC AGACTCACAA TATGCATTAG GAATCATTCA AGCACAACCA GATCAAAGTG 3300 
AATCAGAGTT AGTCAATCAA ATAATAGAGC AGTTAATAAA AAAGGAAAAG GTCTATCTGG 3360 
CATGGGTACC AGCACACAAA GGAATTGGAG GAAATGAACA AGTAGATAAA TTAGTCAGTG 3420 
CTGGAATCAG GAAAGTACTA TTTTTAGATG GAATAGATAA GGCCCAAGAT GAACATGAGA 3480 
AATATCACAG TAATTGGAGA GCAATGGCTA GTGATTTTAA CCTGCCACCT GTAGTAGCAA 3540 
AAGAAATAGT AGCCAGCTGT GATAAATGTC AGCTAAAAGG AGAAGCCATG CATGGACAAG 3600 
TAGACTGTAG TCCAGGAATA TGGCAACTAG ATTGTACACA TTTAGAAGGA AAAGTTATCC 3660 
TGGTAGCAGT TCATGTAGCC AGTGGATATA TAGAAGCAGA AGTTATTCCA GCAGAAACAG 3720 
GGCAGGAAAC AGCATATTTT CTTTTAAAAT TAGCAGGAAG ATGGCCAGTA AAAACAATAC 3780 
ATACTGACAA TGGCAGCAAT TTCACCGGTG CTACGGTTAG GGCCGCCTGT TGGTGGGCGG 3840 
GAATCAAGCA GGAATTTGGA ATTCCCTACA ATCCCCAAAG TCAAGGAGTA GTAGAATCTA 3900 
TGAATAAAGA ATTAAAGAAA ATTATAGGAC AGGTAAGAGA TCAGGCTGAA CATCTTAAGA 3960 
CAGCAGTACA AATGGCAGTA TTCATCCACA ATTTTAAAAG AAAAGGGGGG ATTGGGGGGT 4020 
ACAGTGCAGG GGAAAGAATA GTAGACATAA TAGCAACAGA CATACAAACT AAAGAATTAC 4080 
AAAAACAAAT TACAAAAATT CAAAATTTTC GGGTTTATTA CAGGGACAGC AGAAATTCAC 4140 
TTTGGAAAGG ACCAGCAAAG CTCCTCTGGA AAGGTGAAGG GGCAGTAGTA ATACAAGATA 4200 
ATAGTGACAT AAAAGTAGTG CCAAGAAGAA AAGCAAAGAT CATTAGGGAT TATGGAAAAC 4260 
AGATGGCAGG TGATGATTGT GTGGCAAGTA GACAGGATGA GGATTAG 4307 



SEQ. ID. NO. 2 - gagpol-SYNgp - codon optimised gagpol sequence 

ATGGGCGCCC GCGCCAGCGT GCTGTCGGGC GGCGAGCTGG ACCGCTGGGA GAAGATCCGC 60 
CTGCGCCCCG GCGGCAAAAA GAAGTACAAG CTGAAGCACA TCGTGTGGGC CAGCCGCGAA 120 
CTGGAGCGCT TCGCCGTGAA CCCCGGGCTC CTGGAGACCA GCGAGGGGTG CCGCCAGATC 180 
CTCGGCCAAC TGCAGCCCAG CCTGCAAACC GGCAGCGAGG AGCTGCGCAG CCTGTACAAC 240 
ACCGTGGCCA CGCTGTACTG CGTCCACCAG CGCATCGAAA TCAAGGATAC GAAAGAGGCC 300 
CTGGATAAAA TCGAAGAGGA ACAGAATAAG AGCAAAAAGA AGGCCCAACA GGCCGCCGCG 360 
GACACCGGAC ACAGCAACCA GGTCAGCCAG AACTACCCCA TCGTGCAGAA CATCCAGGGG 420 
CAGATGGTGC ACCAGGCCAT CTCCCCCCGC ACGCTGAACG CCTGGGTGAA GGTGGTGGAA 480 
GAGAAGGCTT TTAGCCCGGA GGTGATACCC ATGTTCTCAG CCCTGTCAGA GGGAGCCACC 540 
CCCCAAGATC TGAACACCAT GCTCAACACA GTGGGGGGAC ACCAGGCCGC CATGCAGATG 600 
CTGAAGGAGA CCATCAATGA GGAGGCTGCC GAATGGGATC GTGTGCATCC GGTGCACGCA 660 
GGGCCCATCG CACCGGGCCA GATGCGTGAG CCACGGGGCT CAGACATCGC CGGAACGACT 720 
AGTACCCTTC AGGAACAGAT CGGCTGGATG ACCAACAACC CACCCATCCC GGTGGGAGAA 780 
ATCTACAAAC GCTGGATCAT CCTGGGCCTG AACAAGATCG TGCGCATGTA TAGCCCTACC 840 
AGCATCCTGG ACATCCGCCA AGGCCCGAAG GAACCCTTTC GCGACTACGT GGACCGGTTC 900 
TACAAAACGC TCCGCGCCGA GCAGGCTAGC CAGGAGGTGA AGAACTGGAT GACCGAAACC 960 
CTGCTGGTCC AGAACGCGAA CCCGGACTGC AAGACGATCC TGAAGGCCCT GGGCCCAGCG 1020 
GCTACCCTAG AGGAAATGAT GACCGCCTGT CAGGGAGTGG GCGGACCCGG CCACAAGGCA 1080 
CGCGTCCTGG CTGAGGCCAT GAGCCAGGTG ACCAACTCCG CTACCATCAT GATGCAGCGC 1140 
GGCAACTTTC GGAACCAACG CAAGATCGTC AAGTGCTTCA ACTGTGGCAA AGAAGGGCAC 1200 
ACAGCCCGCA ACTGCAGGGC CCCTAGGAAA AAGGGCTGCT GGAAATGCGG CAAGGAAGGC 1260 
CACCAGATGA AAGACTGTAC TGAGAGACAG GCTAATTTTT TAGGGAAGAT CTGGCCTTCC 1320 
TACAAGGGAA GGCCAGGGAA TTTTCTTCAG AGCAGACCAG AGCCAACAGC CCCACCAGAA 1380 
GAGAGCTTCA GGTCTGGGGT AGAGACAACA ACTCCCCCTC AGAAGCAGGA GCCGATAGAC 1440 
AAGGAACTGT ATCCTTTAAC TTCCCTCAGA TCACTCTTTG GCAACGACCC CTCGTCACAA 1500 
TAAAGATAGG GGGGCAGCTC AAGGAGGCTC TCCTGGACAC CGGAGCAGAC GACACCGTGC 1560 
TGGAGGAGAT GTCGTTGCCA GGCCGCTGGA AGCCGAAGAT GATCGGGGGA ATCGGCGGTT 1620 
TCATCAAGGT GCGCCAGTAT GACCAGATCC TCATCGAAAT CTGCGGCCAC AAGGCTATCG 1680 
GTACCGTGCT GGTGGGCCCC ACACCCGTCA ACATCATCGG ACGCAACCTG TTGACGCAGA 1740 
TCGGTTGCAC GCTGAACTTC CCCATTAGCC CTATCGAGAC GGTACCGGTG AAGCTGAAGC 1800 
CCGGGATGGA CGGCCCGAAG GTCAAGCAAT GGCCATTGAC AGAGGAGAAG ATCAAGGCAC 1860 
TGGTGGAGAT TTGCACAGAG ATGGAAAAGG AAGGGAAAAT CTCCAAGATT GGGCCTGAGA 1920 
ACCCGTACAA CACGCCGGTG TTCGCAATCA AGAAGAAGGA CTCGACGAAA TGGCGCAAGC 1980 
TGGTGGACTT CCGCGAGCTG AACAAGCGCA CGCAAGACTT CTGGGAGGTT CAGCTGGGCA 2040 
TCCCGCACCC CGCAGGGCTG AAGAAGAAGA AATCCGTGAC CGTACTGGAT GTGGGTGATG 2100 
CCTACTTCTC CGTTCCCCTG GACGAAGACT TCAGGAAGTA CACTGCCTTC ACAATCCCTT 2160 
CGATCAACAA CGAGACACCG GGGATTCGAT ATCAGTACAA CGTGCTGCCC CAGGGCTGGA 2220 
AAGGCTCTCC CGCAATCTTC CAGAGTAGCA TGACCAAAAT CCTGGAGCCT TTCCGCAAAC 2280 
AGAACCCCGA CATCGTCATC TATCAGTACA TGGATGACTT GTACGTGGGC TCTGATCTAG 2340 
AGATAGGGCA GCACCGCACC AAGATCGAGG AGCTGCGCCA GCACCTGTTG AGGTGGGGAC 2400 
TGACCACACC CGACAAGAAG CACCAGAAGG AGCCTCCCTT CCTCTGGATG GGTTACGAGC 2460 
TGCACCCTGA CAAATGGACC GTGCAGCCTA TCGTGCTGCC AGAGAAAGAC AGCTGGACTG 2520 
TCAACGACAT ACAGAAGCTG GTGGGGAAGT TGAACTGGGC CAGTCAGATT TACCCAGGGA 2580 
TTAAGGTGAG GCAGCTGTGC AAACTCCTCC GCGGAACCAA GGCACTCACA GAGGTGATCC 2640 
CCCTAACCGA GGAGGCCGAG CTCGAACTGG CAGAAAACCG AGAGATCCTA AAGGAGCCCG 2700 
TGCACGGCGT GTACTATGAC CCCTCCAAGG ACCTGATCGC CGAGATCCAG AAGCAGGGGC 2760 
AAGGCCAGTG GACCTATCAG ATTTACCAGG AGCCCTTCAA GAACCTGAAG ACCGGCAAGT 2820 
ACGCCCGGAT GAGGGGTGCC CACACTAACG ACGTCAAGCA GCTGACCGAG GCCGTGCAGA 2880 
AGATCACCAC CGAAAGCATC GTGATCTGGG GAAAGACTCC TAAGTTCAAG CTGCCCATCC 2940 
AGAAGGAAAC CTGGGAAACC TGGTGGACAG AGTATTGGCA GGCCACCTGG ATTCCTGAGT 3000 
GGGAGTTCGT CAACACCCCT CCCCTGGTGA AGCTGTGGTA CCAGCTGGAG AAGGAGCCCA 3060 
TAGTGGGCGC CGAAACCTTC TACGTGGATG GGGCCGCTAA CAGGGAGACT AAGCTGGGCA 3120 
AAGCCGGATA CGTCACTAAC CGGGGCAGAC AGAAGGTTGT CACCCTCACT GACACCACCA 3180 
ACCAGAAGAC TGAGCTGCAG GCCATTTACC TCGCTTTGCA GGACTCGGGC CTGGAGGTGA 3240 
ACATCGTGAC AGACTCTCAG TATGCCCTGG GCATCATTCA AGCCCAGCCA GACCAGAGTG 3300 
AGTCCGAGCT GGTCAATCAG ATCATCGAGC AGCTGATCAA GAAGGAAAAG GTCTATCTGG 3360 
CCTGGGTACC CGCCCACAAA GGCATTGGCG GCAATGAGCA GGTCGACAAG CTGGTCTCGG 3420 
CTGGCATCAG GAAGGTGCTA TTCCTGGATG GCATCGACAA GGCCCAGGAC GAGCACGAGA 3480 
AATACCACAG CAACTGGCGG GCCATGGCTA GCGACTTCAA CCTGCCCCCT GTGGTGGCCA 3540 
AAGAGATCGT GGCCAGCTGT GACAAGTGTC AGCTCAAGGG CGAAGCCATG CATGGCCAGG 3600 
TGGACTGTAG CCCCGGCATC TGGCAACTCG ATTGCACCCA TCTGGAGGGC AAGGTTATCC 3660 
TGGTAGCCGT CCATGTGGCC AGTGGCTACA TCGAGGCCGA GGTCATTCCC GCCGAAACAG 3720 
GGCAGGAGAC AGCCTACTTC CTCCTGAAGC TGGCAGGCCG GTGGCCAGTG AAGACCATCC 3780 
ATACTGACAA TGGCAGCAAT TTCACCAGTG CTACGGTTAA GGCCGCCTGC TGGTGGGCGG 3840 
GAATCAAGCA GGAGTTCGGG ATCCCCTACA ATCCCCAGAG TCAGGGCGTC GTCGAGTCTA 3900 
TGAATAAGGA GTTAAAGAAG ATTATCGGCC AGGTCAGAGA TCAGGCTGAG CATCTCAAGA 3960 
CCGCGGTCCA AATGGCGGTA TTCATCCACA ATTTCAAGCG GAAGGGGGGG ATTGGGGGGT 4020 
ACAGTGCGGG GGAGCGGATC GTGGACATCA TCGCGACCGA CATCCAGACT AAGGAGCTGC 4080 
AAAAGCAGAT TACCAAGATT CAGAATTTCC GGGTCTACTA CAGGGACAGC AGAAATCCCC 4140 
TCTGGAAAGG CCCAGCGAAG CTCCTCTGGA AGGGTGAGGG GGCAGTAGTG ATCCAGGATA 4200 
ATAGCGACAT CAAGGTGGTG CCCAGAAGAA AGGCGAAGAT CATTAGGGAT TATGGCAAAC 4260 
AGATGGCGGG TGATGATTGC GTGGCGAGCA GACAGGATGA GGATTAG 4307 



SEQ. ID. NO. 3 - Envelope Gene from HIV-1 MN (Genbank accession no. M17449) 

ATGAGAGTGA AGGGGATCAG GAGGAATTAT CAGCACTGGT GGGGATGGGG CACGATGCTC 60 
CTTGGGTTAT TAATGATCTG TAGTGCTACA GAAAAATTGT GGGTCACAGT CTATTATGGG 120 
GTACCTGTGT GGAAAGAAGC AACCACCACT CTATTTTGTG CATCAGATGC TAAAGCATAT 180 
GATACAGAGG TACATAATGT TTGGGCCACA CAAGCCTGTG TACCCACAGA CCCCAACCCA 240 
CAAGAAGTAG AATTGGTAAA TGTGACAGAA AATTTTAACA TGTGGAAAAA TAACATGGTA 300 
GAACAGATGC ATGAGGATAT AATCAGTTTA TGGGATCAAA GCCTAAAGCC ATGTGTAAAA 360 
TTAACCCCAC TCTGTGTTAC TTTAAATTGC ACTGATTTGA GGAATACTAC TAATACCAAT 420 
AATAGTACTG CTAATAACAA TAGTAATAGC GAGGGAACAA TAAAGGGAGG AGAAATGAAA 480 
AACTGCTCTT TCAATATCAC CACAAGCATA AGAGATAAGA TGCAGAAAGA ATATGCACTT 540 
CTTTATAAAC TTGATATAGT ATCAATAGAT AATGATAGTA CCAGCTATAG GTTGATAAGT 600 
TGTAATACCT CAGTCATTAC ACAAGCTTGT CCAAAGATAT CCTTTGAGCC AATTCCCATA 660 
CACTATTGTG CCCCGGCTGG TTTTGCGATT CTAAAATGTA ACGATAAAAA GTTCAGTGGA 720 
AAAGGATCAT GTAAAAATGT CAGCACAGTA CAATGTACAC ATGGAATTAG GCCAGTAGTA 780 
TCAACTCAAC TGCTGTTAAA TGGCAGTCTA GCAGAAGAAG AGGTAGTAAT TAGATCTGAG 840 
AATTTCACTG ATAATGCTAA AACCATCATA GTACATCTGA ATGAATCTGT ACAAATTAAT 900 
TGTACAAGAC CCAACTACAA TAAAAGAAAA AGGATACATA TAGGACCAGG GAGAGCATTT 960 
TATACAACAA AAAATATAAT AGGAACTATA AGACAAGCAC ATTGTAACAT TAGTAGAGCA 1020 
AAATGGAATG ACACTTTAAG ACAGATAGTT AGCAAATTAA AAGAACAATT TAAGAATAAA 1080 
ACAATAGTCT TTAATCAATC CTCAGGAGGG GACCCAGAAA TTGTAATGCA CAGTTTTAAT 1140 
TGTGGAGGGG AATTTTTCTA CTGTAATACA TCACCACTGT TTAATAGTAC TTGGAATGGT 1200 
AATAATACTT GGAATAATAC TACAGGGTCA AATAACAATA TCACACTTCA ATGCAAAATA 1260 
AAACAAATTA TAAACATGTG GCAGGAAGTA GGAAAAGCAA TGTATGCCCC TCCCATTGAA 1320 
GGACAAATTA GATGTTCATC AAATATTACA GGGCTACTAT TAACAAGAGA TGGTGGTAAG 1380 
GACACGGACA CGAACGACAC CGAGATCTTC AGACCTGGAG GAGGAGATAT GAGGGACAAT 1440 
TGGAGAAGTG AATTATATAA ATATAAAGTA GTAACAATTG AACCATTAGG AGTAGCACCC 1500 
ACCAAGGCAA AGAGAAGAGT GGTGCAGAGA GAAAAAAGAG CAGCGATAGG AGCTCTGTTC 1560 
CTTGGGTTCT TAGGAGCAGC AGGAAGCACT ATGGGCGCAG CGTCAGTGAC GCTGACGGTA 1620 
CAGGCCAGAC TATTATTGTC TGGTATAGTG CAACAGCAGA ACAATTTGCT GAGGGCCATT 1680 
GAGGCGCAAC AGCATATGTT GCAACTCACA GTCTGGGGCA TCAAGCAGCT CCAGGCAAGA 1740 
GTCCTGGCTG TGGAAAGATA CCTAAAGGAT CAACAGCTCC TGGGGTTTTG GGGTTGCTCT 1800 
GGAAAACTCA TTTGCACCAC TACTGTGCCT TGGAATGCTA GTTGGAGTAA TAAATCTCTG 1860 
GATGATATTT GGAATAACAT GACCTGGATG CAGTGGGAAA GAGAAATTGA CAATTACACA 1920 
AGCTTAATAT ACTCATTACT AGAAAAATCG CAAACCCAAC AAGAAAAGAA TGAACAAGAA 1980 
TTATTGGAAT TGGATAAATG GGCAAGTTTG TGGAATTGGT TTGACATAAC AAATTGGCTG 2040 
TGGTATATAA AAATATTCAT AATGATAGTA GGAGGCTTGG TAGGTTTAAG AATAGTTTTT 2100 
GCTGTACTTT CTATAGTGAA TAGAGTTAGG CAGGGATACT CACCATTGTC GTTGCAGACC 2160 
CGCCCCCCAG TTCCGAGGGG ACCCGACAGG CCCGAAGGAA TCGAAGAAGA AGGTGGAGAG 2220 
AGAGACAGAG ACACATCCGG TCGATTAGTG CATGGATTCT TAGCAATTAT CTGGGTCGAC 2280 
CTGCGGAGCC TGTTCCTCTT CAGCTACCAC CACAGAGACT TACTCTTGAT TGCAGCGAGG 2340 
ATTGTGGAAC TTCTGGGACG CAGGGGGTGG GAAGTCCTCA AATATTGGTG GAATCTCCTA 2400 
CAGTATTGGA GTCAGGAACT AAAGAGTAGT GCTGTTAGCT TGCTTAATGC CACAGCTATA 2460 
GCAGTAGCTG AGGGGACAGA TAGGGTTATA GAAGTACTGC AAAGAGCTGG TAGAGCTATT 2520 
CTCCACATAC CTACAAGAAT AAGACAGGGC TTGGAAAGGG CTTTGCTATA A 2571 



SEQ. ID. NO. 4 - SYNgp-160mn - codon optimised env sequence 

ATGAGGGTGA AGGGGATCCG CCGCAACTAC CAGCACTGGT GGGGCTGGGG CACGATGCTC 60 
CTGGGGCTGC TGATGATCTG CAGCGCCACC GAGAAGCTGT GGGTGACCGT GTACTACGGC 120 
GTGCCCGTGT GGAAGGAGGC CACCACCACC CTGTTCTGCG CCAGCGACGC CAAGGCGTAC 180 
GACACCGAGG TGCACAACGT GTGGGCCACC CAGGCGTGCG TGCCCACCGA CCCCAACCCC 240 
CAGGAGGTGG AGCTCGTGAA CGTGACCGAG AACTTCAACA TGTGGAAGAA CAACATGGTG 300 
GAGCAGATGC ATGAGGACAT CATCAGCCTG TGGGACCAGA GCCTGAAGCC CTGCGTGAAG 360 
CTGACCCCCC TGTGCGTGAC CCTGAACTGC ACCGACCTGA GGAACACCAC CAACACCAAC 420 
AACAGCACCG CCAACAACAA CAGCAACAGC GAGGGCACCA TCAAGGGCGG CGAGATGAAG 480 
AACTGCAGCT TCAACATCAC CACCAGCATC CGCGACAAGA TGCAGAAGGA GTACGCCCTG 540 
CTGTACAAGC TGGATATCGT GAGCATCGAC AACGACAGCA CCAGCTACCG CCTGATCTCC 600 
TGCAACACCA GCGTGATCAC CCAGGCCTGC CCCAAGATCA GCTTCGAGCC CATCCCCATC 660 
CACTACTGCG CCCCCGCCGG CTTCGCCATC CTGAAGTGCA ACGACAAGAA GTTCAGCGGC 720 
AAGGGCAGCT GCAAGAACGT GAGCACCGTG CAGTGCACCC ACGGCATCCG GCCGGTGGTG 780 
AGCACCCAGC TCCTGCTGAA CGGCAGCCTG GCCGAGGAGG AGGTGGTGAT CCGCAGCGAG 840 
AACTTCACCG ACAACGCCAA GACCATCATC GTGCACCTGA ATGAGAGCGT GCAGATCAAC 900 
TGCACGCGTC CCAACTACAA CAAGCGCAAG CGCATCCACA TCGGCCCCGG GCGCGCCTTC 960 
TACACCACCA AGAACATCAT CGGCACCATC CGCCAGGCCC ACTGCAACAT CTCTAGAGCC 1020 
AAGTGGAACG ACACCCTGCG CCAGATCGTG AGCAAGCTGA AGGAGCAGTT CAAGAACAAG 1080 
ACCATCGTGT TCAACCAGAG CAGCGGCGGC GACCCCGAGA TCGTGATGCA CAGCTTCAAC 1140 
TGCGGCGGCG AATTCTTCTA CTGCAACACC AGCCCCCTGT TCAACAGCAC CTGGAACGGC 1200 
AACAACACCT GGAACAACAC CACCGGCAGC AACAACAATA TTACCCTCCA GTGCAAGATC 1260 
AAGCAGATCA TCAACATGTG GCAGGAGGTG GGCAAGGCCA TGTACGCCCC CCCCATCGAG 1320 
GGCCAGATCC GGTGCAGCAG CAACATCACC GGTCTGCTGC TGACCCGCGA CGGCGGCAAG 1380 
GACACCGACA CCAACGACAC CGAAATCTTC CGCCCCGGCG GCGGCGACAT GCGCGACAAC 1440 
TGGAGATCTG AGCTGTACAA GTACAAGGTG GTGACGATCG AGCCCCTGGG CGTGGCCCCC 1500 
ACCAAGGCCA AGCGCCGCGT GGTGCAGCGC GAGAAGCGGG CCGCCATCGG CGCCCTGTTC 1560 
CTGGGCTTCC TGGGGGCGGC GGGCAGCACC ATGGGGGCCG CCAGCGTGAC CCTGACCGTG 1620 
CAGGCCCGCC TGCTCCTGAG CGGCATCGTG CAGCAGCAGA ACAACCTCCT CCGCGCCATC 1680 
GAGGCCCAGC AGCATATGCT CCAGCTCACC GTGTGGGGCA TCAAGCAGCT CCAGGCCCGC 1740 
GTGCTGGCCG TGGAGCGCTA CCTGAAGGAC CAGCAGCTCC TGGGCTTCTG GGGCTGCTCC 1800 
GGCAAGCTGA TCTGCACCAC CACGGTACCC TGGAACGCCT CCTGGAGCAA CAAGAGCCTG 1860 
GACGACATCT GGAACAACAT GACCTGGATG CAGTGGGAGC GCGAGATCGA TAACTACACC 1920 
AGCCTGATCT ACAGCCTGCT GGAGAAGAGC CAGACCCAGC AGGAGAAGAA CGAGCAGGAG 1980 
CTGCTGGAGC TGGACAAGTG GGCGAGCCTG TGGAACTGGT TCGACATCAC CAACTGGCTG 2040 
TGGTACATCA AAATCTTCAT CATGATTGTG GGCGGCCTGG TGGGCCTCCG CATCGTGTTC 2100 
GCCGTGCTGA GCATCGTGAA CCGCGTGCGC CAGGGCTACA GCCCCCTGAG CCTCCAGACC 2160 
CGGCCCCCCG TGCCGCGCGG GCCCGACCGC CCCGAGGGCA TCGAGGAGGA GGGCGGCGAG 2220 
CGCGACCGCG ACACCAGCGG CAGGCTCGTG CACGGCTTCC TGGCGATCAT CTGGGTCGAC 2280 
CTCCGCAGCC TGTTCCTGTT CAGCTACCAC CACCGCGACC TGCTGCTGAT CGCCGCCCGC 2340 
ATCGTGGAAC TCCTAGGCCG CCGCGGCTGG GAGGTGCTGA AGTACTGGTG GAACCTCCTC 2400 
CAGTATTGGA GCCAGGAGCT GAAGTCCAGC GCCGTGAGCC TGCTGAACGC CACCGCCATC 2460 
GCCGTGGCCG AGGGCACCGA CCGCGTGATC GAGGTGCTCC AGAGGGCCGG GAGGGCGATC 2520 
CTGCACATCC CCACCCGCAT CCGCCAGGGG CTCGAGAGGG CGCTGCTGTA A 2571 



SEQ. ID. NO. 11 - Complete Sequence of pH4DOZENEGS 

CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG CGCAGCGTGA 60 
CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC TTTCTTCCCT TCCTTTCTCG 120 
CCACGTTCGC CGGCTTTCCC CGTCAAGCTC TAAATCGGGG GCTCCCTTTA GGGTTCCGAT 180 
TTAGTGCTTT ACGGCACCTC GACCCCAAAA AACTTGATTA GGGTGATGGT TCACGTAGTG 240 
GGCCATCGCC CTGATAGACG GTTTTTCGCC CTTTGACGTT GGAGTCCACG TTCTTTAATA 300 
GTGGACTCTT GTTCCAAACT GGAACAACAC TCAACCCTAT CTCGGTCTAT TCTTTTGATT 360 
TATAAGGGAT TTTGCCGATT TCGGCCTATT GGTTAAAAAA TGAGCTGATT TAACAAAAAT 420 
TTAACGCGAA TTTTAACAAA ATATTAACGC TTACAATTTC CATTCGCCAT TCAGGCTGCG 480 
CAACTGTTGG GAAGGGCGAT CGGTGCGGGC CTCTTCGCTA TTACGCCAGC TGGCGAAAGG 540 
GGGATGTGCT GCAAGGCGAT TAAGTTGGGT AACGCCAGGG TTTTCCCAGT CACGACGTTG 600 
TAAAACGACG GCCAGTGAGC GCGCGTAATA CGACTCACTA TAGGGCGAAT TGGAGCTCCA 660 
CCGCGGTGGC GGCCGCTCTA GAGTCCGTTA CATAACTTAC GGTAAATGGC CCGCCTGGCT 720 
GACCGCCCAA CGACCCCCGC CCATTGACGT CAATAATGAC GTATGTTCCC ATAGTAACGC 780 
CAATAGGGAC TTTCCATTGA CGTCAATGGG TGGAGTATTT ACGGTAAACT GCCCACTTGG 840 
CAGTACATCA AGTGTATCAT ATGCCAAGTA CGCCCCCTAT TGACGTCAAT GACGGTAAAT 900 
GGCCCGCCTG GCATTATGCC CAGTACATGA CCTTATGGGA CTTTCCTACT TGGCAGTACA 960 
TCTACGTATT AGTCATCGCT ATTACCATGG TGATGCGGTT TTGGCAGTAC ATCAATGGGC 1020 
GTGGATAGCG GTTTGACTCA CGGGGATTTC CAAGTCTCCA CCCCATTGAC GTCAATGGGA 1080 
GTTTGTTTTG GCACCAAAAT CAACGGGACT TTCCAAAATG TCGTAACAAC TCCGCCCCAT 1140 
TGACGCAAAT GGGCGGTAGG CGTGTACGGT GGGAGGTCTA TATAAGCAGA GCTCGTTTAG 1200 
TGAACCGGTC TCTCTGGTTA GACCAGATCT GAGCCTGGGA GCTCTCTGGC TAACTAGGGA 1260 
ACCCACTGCT TAAGCCTCAA TAAAGCTTGC CTTGAGTGCT TCAAGTAGTG TGTGCCCGTC 1320 
TGTTGTGTGA CTCTGGTAAC TAGAGATCCC TCAGACCCTT TTAGTCAGTG TGGAAAATCT 1380 
CTAGCAGTGG CGCCCGAACA GGGACTTGAA AGCGAAAGGG AAACCAGAGG AGCTCTCTCG 1440 
ACGCAGGACT CGGCTTGCTG AAGCGCGCAC GGCAAGAGGC GAGGGGCGGC GACTGGTGAG 1500 
TACGCCAAAA ATTTTGACTA GCGGAGGCTA GAAGGAGAGA GATGGGTGCG AGAGCGTCAG 1560 
TATTAAGCGG GGGAGAATTA GATCGCGATG GGAAAAAATT CGGTTAAGGC CAGGGGGAAA 1620 
GAAAAAATAT AAATTAAAAC ATATAGTATG GGCAAGCAGG GAGCTAGAAC GATTCGCAGT 1680 
TAATCCTGGC CTGTTAGAAA CATCAGAAGG CTGTAGACAA ATACTGGGAC AGCTACAACC 1740 
ATCCCTTCAG ACAGGATCAG AAGAACTTAG ATCATTATAT AATACAGTAG CAACCCTCTA 1800 
TTGTGTGCAT CAAAGGTTGA GATAAAAGAC ACCAAGGAAG CTTTAGACAA GATAGAGGGA 1860 
GAGCAAAACA AAAGTAAGAA AAAAGCACAG CAAGCAGCAG CTGACACAGG ACACAGCAAT 1920 
CAGGTCAGCC AAAATTACCC TATAGTGCAG AACATCCAGG GGCAAATGGT ACATCAGGCC 1980 
ATATCACCTA GAACTTTAAA TGCATGGGTA AAAGTAGTAG AAGAGAAGGC TTTCAGCCCA 2040 
GAAGTGATAC CCATGTTTTC AGCATTATCA GAAGGAGCCA CCCCACAAGA TTTAAACACC 2100 
ATGCTAAACA CAGTGGGGGG ACATCAAGCA GCCATGCAAA TGTTAAAAGA GACCATCAAT 2160 
GAGGAAGCTG CAGGAATTCG CCTAAAACTG CTTGTACCAA TTGCTATTGT AAAAAGTGTT 2220 
GCTTTCATTG CCAAGTTTGT TTCATAACAA AAGCCTTAGG CATCTCCTAT GGCAGGAAGA 2280 
AGCGGAGACA GCGACGAAGA GCTCATCAGA ACAGTCAGAC TCATCAAGCT TCTCTATCAA 2340 
AGCAGTAAGT AGTACATGTA ACGCAACCTA TACCAATAGT AGCAATAGTA GCATTAGTAG 2400 
TAGCAATAAT AATAGCAATA GTTGTGTGGT CCATAGTAAT CATAGAATAT AGGAAAATAT 2460 
TAAGACAAAG AAAAATAGAC AGGTTAATTG ATAGACTAAT AGAAAGAGCA GAAGACAGTG 2520 
GCAATGAGAG TGAAGGAGAA ATATCAGCAC TTGTGGAGAT GGGGGTGGAG ATGGGGCACC 2580 
ATGCTCCTTG GGATGTTGAT GATCTGTAGT GCTACAGAAA AATTGTGGGT CACAGTCTAT 2640 
TATGGGGTAC CTGTGTGGAA GGAAGCAACC ACCACTCTAT TTTGTGCATC AGATGCTAAA 2700 
GCATAGATCT TCAGACTTGG AGGAGGAGAT ATGAGGGACA ATTGGAGAAG TGAATTATAT 2760 
AAATATAAAG TAGTAAAAAT TGAACCATTA GGAGTAGCAC CCACCAAGGC AAAGAGAAGA 2820 
GTGGTGCAGA GAGAAAAAAG AGCAGTGGGA ATAGGAGCTT TGTTCCTTGG GTTCTTGGGA 2880 
GCAGCAGGAA GCACTATGGG CGCAGCGTCA ATGACGCTGA CGGTACAGGC CAGACAATTA 2940 
TTGTCTGGTA TAGTGCAGCA GCAGAACAAT TTGCTGAGGG CTATTGAGGC GCAACAGCAT 3000 
CTGTTGCAAC TCACAGTCTG GGGCATCAAG CAGCTCCAGG CAAGAATCCT GGCTGTGGAA 3060 
AGATACCTAA AGGATCAACA GCTCCTGGGG ATTTGGGGTT GCTCTGGAAA ACTCATTTGC 3120 
ACCACTGCTG TGCCTTGGAA TGCTAGTTGG AGTAATAAAT CTCTGGAACA GATCTGGAAT 3180 
CACACGACCT GGATGGAGTG GGACAGAGAA ATTAACAATT ACACAAGCTT AATACACTCC 3240 
TTAATTGAAG AATCGCAAAA CCAGCAAGAA AAGAATGAAC AAGAATTATT GGAATTAGAT 3300 
AAATGGGCAA GTTTGTGGAA TTGGTTTAAC ATAACAAATT GGCTGTGGTA TATAAAATTA 3360 
TTCATAATGA TAGTAGGAGG CTTGGTAGGT TTAAGAATAG TTTTTGCTGT ACTTTCTATA 3420 
GTGAATAGAG TTAGGCAGGG ATATTCACCA TTATCGTTTC AGACCCACCT CCCAACCCCG 3480 
AGGGGACCCG ACAGGCCCGA AGGAATAGAA GAAGAAGGTG GAGAGAGAGA CAGAGACAGA 3540 
TCCATTCGAT TAGTGAACGG ATCCTTGGCA CTTATCTGGG ACGATCTGCG GAGCCTGTGC 3600 
CTCTTCAGCT ACCACCGCTT GAGAGACTTA CTCTTGATTG TAACGAGGAT TGTGGAACTT 3660 
CTGGGACGCA GGGGGTGGGA AGCCCTCAAA TATTGGTGGA ATCTCCTACA GTATTGGAGT 3720 
CAGGAACTAA AGAATAGTGC TGTTAGCTTG CTCAATGCCA CAGCCATAGC AGTAGCTGAG 3780 
GGGACAGATA GGGTTATAGA AGTAGTACAA GGAGCTTGTA GAGCTATTCG CCACATACCT 3840 
AGAAGAATAA GACAGGGCTT GGAAAGGATT TTGCTATAAG ATGGGTGGCA AGTGGTCAAA 3900 
AAGTAGTGTG ATTGGATGGC CTACTGTAAG GGAAAGAATG AGACGAGCTG AGCCAGCAGC 3960 
AGATAGGGTG GGAGCAGCAT CTCGACGCTG CAGGAGTGGG GAGGCACGAT GGCCGCTTTG 4020 
GTCGAGGCGG ATCCGGCCAT TAGCCATATT ATTCATTGGT TATATAGCAT AAATCAATAT 4080 
TGGCTATTGG CCATTGCATA CGTTGTATCC ATATCATAAT ATGTACATTT ATATTGGCTC 4140 
ATGTCCAACA TTACCGCCAT GTTGACATTG ATTATTGACT AGTTATTAAT AGTAATCAAT 4200 
TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC TTACGGTAAA 4260 
TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA TGACGTATGT 4320 
TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA TGGGTGGAGT ATTTACGGTA 4380 
AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTACGCCCC CTATTGACGT 4440 
CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAT GGGACTTTCC 4500 
TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC GGTTTTGGCA 4560 
GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC TCCACCCCAT 4620 
TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG GACTTTCCAA AATGTCGTAA 4680 
CAACTCCGCC CCATTGACGC AAATGGGCGG TAGGCATGTA CGGTGGGAGG TCTATATAAG 4740 
CAGAGCTCGT TTAGTGAACC GTCAGATCGC CTGGAGACGC CATCCACGCT GTTTTGACCT 4800 
CCATAGAAGA CACCGGGACC GATCCAGCCT CCGCGGCCCC AAGCTTCAGC TGCTCGAGCC 4860 
CGGGGATGAC GTCATCGACT TCGAAGGTTC GAATCCTTCT ACTGCCACCA TTTTTTCTCT 4920 
ACGTCATCGA CTTCGAAGGT TCGAATCCTT CGCTGTCCAC CAGTCGAGTA TTACGTCATC 4980 
GACTTCGAAG GTTCGAATCC TTCTAGATTC ACCATTTTTT AGGAACGTCA TCGACTTCGA 5040 
AGGTTCGAAT CCTTCCAGTT CCACCAGTCG AGGCCAACGT CATCGACTTC GAAGGTTCGA 5100 
ATCCTTCTCT TCCCACCATT TTTTTTCCAC GTCATCGACT TCGAAGGTTC GAATCCTTCG 5160 
GGGCCCACCA GTCGAGGGCT ACGTCATCGA CTTCGAAGGT TCGAATCCTT CTTGCTTCAC 5220 
CATTTTTTCT GAACGTCATC GACTTCGAAG GTTCGAATCC TTCTGCTGTC ACCAGTCGAG 5280 
TATAACGTCA TCGACTTCGA AGGTTCGAAT CCTTCACCGG TCACCATTTT TTTATAACGT 5340 
CATCGACTTC GAAGGTTCGA ATCCTTCTTC TTACACCAGT CGAGGTACAC GTCATCGACT 5400 
TCGAAGGTTC GAATCCTTCG TAGTTCACCA TTTTTTGTGC ACGTCATCGA CTTCGAAGGT 5460 
TCGAATCCTT CTAGGCCCAC CAGTCGACGC ATGCCTGCAG GTCGAGGTCG ATACCGTCGA 5520 
GACCTAGAAA AACATGGAGC AATCACAAGT AGCAATACAG CAGCTACCAA TGCTGATTGT 5580 
GCCTGGCTAG AAGCACAAGA GGAGGAGGAG GTGGGTTTTC CAGTCACACC TCAGGTACCT 5640 
TTAAGACCAA TGACTTACAA GGCAGCTGTA GATCTTAGCC ACTTTTTAAA AGAAAAGGGG 5700 
GGACTGGAAG GGCTAATTCA CTCCCAACGA AGACAAGATA TCCTTGATCT GTGGATCTAC 5760 
CACACACAAG GCTACTTCCC TGATTGGCAG AACTACACAC CAGGGCCAGG GATCAGATAT 5820 
CCACTGACCT TTGGATGGTG CTACAAGCTA GTACCAGTTG AGCAAGAGAA GGTAGAAGAA 5880 
GCCAATGAAG GAGAGAACAC CCGCTTGTTA CACCCTGTGA GCCTGCATGG GATGGATGAC 5940 
CCGGAGAGAG AAGTATTAGA GTGGAGGTTT GACAGCCGCC TAGCATTTCA TCACATGGCC 6000 
CGAGAGCTGC ATCCGGAGTA CTTCAAGAAC TGCTGACATC GAGCTTGCTA CAAGGGACTT 6060 
TCCGCTGGGG ACTTTCCAGG GAGGCGTGGC CTGGGCGGGA CTGGGGAGTG GCGAGCCCTC 6120 
AGATGCTGCA TATAAGCAGC TGCTTTTTGC CTGTACTGGG TCTCTCTGGT TAGACCAGAT 6180 
CTGAGCCTGG GAGCTCTCTG GCTAACTAGG GAACCCACTG CTTAAGCCTC AATAAAGCTT 6240 
GCCTTGAGTG CTTCAAGTAG TGTGTGCCCG TCTGTTGTGT GACTCTGGTA ACTAGAGATC 6300 
CCTCAGACCC TTTTAGTCAG TGTGGAAAAT CTCTAGCAGT CGAGGGGGGG CCCGGTACCC 6360 
AGCTTTTGTT CCCTTTAGTG AGGGTTAATT GCGCGCTTGG CGTAATCATG GTCATAGCTG 6420 
TTTCCTGTGT GAAATTGTTA TCCGCTCACA ATTCCACACA ACATACGAGC CGGAAGCATA 6480 
AAGTGTAAAG CCTGGGGTGC CTAATGAGTG AGCTAACTCA CATTAATTGC GTTGCGCTCA 6540 
CTGCCCGCTT TCCAGTCGGG AAACCTGTCG TGCCAGCTGC ATTAATGAAT CGGCCAACGC 6600 
GCGGGGAGAG GCGGTTTGCG TATTGGGCGC TCTTCCGCTT CCTCGCTCAC TGACTCGCTG 6660 
CGCTCGGTCG TTCGGCTGCG GCGAGCGGTA TCAGCTCACT CAAAGGCGGT AATACGGTTA 6720 
TCCACAGAAT CAGGGGATAA CGCAGGAAAG AACATGTGAG CAAAAGGCCA GCAAAAGGCC 6780 
AGGAACCGTA AAAAGGCCGC GTTGCTGGCG TTTTTCCATA GGCTCCGCCC CCCTGACGAG 6840 
CATCACAAAA ATCGACGCTC AAGTCAGAGG TGGCGAAACC CGACAGGACT ATAAAGATAC 6900 
CAGGCGTTTC CCCCTGGAAG CTCCCTCGTG CGCTCTCCTG TTCCGACCCT GCCGCTTACC 6960 
GGATACCTGT CCGCCTTTCT CCCTTCGGGA AGCGTGGCGC TTTCTCATAG CTCACGCTGT 7020 
AGGTATCTCA GTTCGGTGTA GGTCGTTCGC TCCAAGCTGG GCTGTGTGCA CGAACCCCCC 7080 
GTTCAGCCCG ACCGCTGCGC CTTATCCGGT AACTATCGTC TTGAGTCCAA CCCGGTAAGA 7140 
CACGACTTAT CGCCACTGGC AGCAGCCACT GGTAACAGGA TTAGCAGAGC GAGGTATGTA 7200 
GGCGGTGCTA CAGAGTTCTT GAAGTGGTGG CCTAACTACG GCTACACTAG AAGGACAGTA 7260 
TTTGGTATCT GCGCTCTGCT GAAGCCAGTT ACCTTCGGAA AAAGAGTTGG TAGCTCTTGA 7320 
TCCGGCAAAC AAACCACCGC TGGTAGCGGT GGTTTTTTTG TTTGCAAGCA GCAGATTACG 7380 
CGCAGAAAAA AAGGATCTCA AGAAGATCCT TTGATCTTTT CTACGGGGTC TGACGCTCAG 7440 
TGGAACGAAA ACTCACGTTA AGGGATTTTG GTCATGAGAT TATCAAAAAG GATCTTCACC 7500 
TAGATCCTTT TAAATTAAAA ATGAAGTTTT AAATCAATCT AAAGTATATA TGAGTAAACT 7560 
TGGTCTGACA GTTACCAATG CTTAATCAGT GAGGCACCTA TCTCAGCGAT CTGTCTATTT 7620 
CGTTCATCCA TAGTTGCCTG ACTCCCCGTC GTGTAGATAA CTACGATACG GGAGGGCTTA 7680 
CCATCTGGCC CCAGTGCTGC AATGATACCG CGAGACCCAC GCTCACCGGC TCCAGATTTA 7740 
TCAGCAATAA ACCAGCCAGC CGGAAGGGCC GAGCGCAGAA GTGGTCCTGC AACTTTATCC 7800 
GCCTCCATCC AGTCTATTAA TTGTTGCCGG GAAGCTAGAG TAAGTAGTTC GCCAGTTAAT 7860 
AGTTTGCGCA ACGTTGTTGC CATTGCTACA GGCATCGTGG TGTCACGCTC GTCGTTTGGT 7920 



ATGGCTTCAT TCAGCTCCGG TTCCCAACGA TCAAGGCGAG TTACATGATC CCCCATGTTG 7980 
TGCAAAAAAG CGGTTAGCTC CTTCGGTCCT CCGATCGTTG TCAGAAGTAA GTTGGCCGCA 8040 
GTGTTATCAC TCATGGTTAT GGCAGCACTG CATAATTCTC TTACTGTCAT GCCATCCGTA 8100 
AGATGCTTTT CTGTGACTGG TGAGTACTCA ACCAAGTCAT TCTGAGAATA GTGTATGCGG 8160 
CGACCGAGTT GCTCTTGCCC GGCGTCAATA CGGGATAATA CCGCGCCACA TAGCAGAACT 8220 
TTAAAAGTGC TCATCATTGG AAAACGTTCT TCGGGGCGAA AACTCTCAAG GATCTTACCG 8280 
CTGTTGAGAT CCAGTTCGAT GTAACCCACT CGTGCACCCA ACTGATCTTC AGCATCTTTT 8340 
ACTTTCACCA GCGTTTCTGG GTGAGCAAAA ACAGGAAGGC AAAATGCCGC AAAAAAGGGA 8400 
ATAAGGGCGA CACGGAAATG TTGAATACTC ATACTCTTCC TTTTTCAATA TTATTGAAGC 8460 
ATTTATCAGG GTTATTGTCT CATGAGCGGA TACATATTTG AATGTATTTA GAAAAATAAA 8520 
CAAATAGGGG TTCCGCGCAC ATTTCCCCGA AAAGTGCCAC 8560 

1. A viral vector system comprising: 

(i) a first nucleotide sequence encoding an external guide sequence capable of binding 
to and effecting the cleavage by RNase P of a second nucleotide sequence, or transcription 
product thereof, encoding a viral polypeptide required for the assembly of viral particles; 
and 

(ii) a third nucleotide sequence encoding said viral polypeptide required for the 
assembly of viral particles, which third nucleotide sequence has a different nucleotide 
sequence to the second nucleotide sequence such that the third nucleotide sequence, or 
transcription product thereof, is resistant to cleavage directed by the external guide 
sequence. 

2. A system according to claim 1 further comprising at least one further first 
nucleotide sequence encoding a gene product capable of binding to and effecting the 
cleavage, directly or indirectly, of a second nucleotide sequence, or transcription product 
thereof, encoding a viral polypeptide required for the assembly of viral particles, wherein 
the gene product is selected from an external guide sequence, a ribozyme and an anti-sense 
ribonucleic acid. 

3. A viral vector production system comprising: 

(i) a viral genome comprising at least one first nucleotide sequence encoding a gene 
product capable of binding to and effecting the cleavage, directly or indirectly, of a second 
nucleotide sequence, or transcription product thereof, encoding a viral polypeptide required 
for the assembly of viral particles; 

(ii) a third nucleotide sequence encoding said viral polypeptide required for the 
assembly of the viral genome into viral particles, which third nucleotide sequence has a 
different nucleotide sequence to the second nucleotide sequence such that said third 
nucleotide sequence, or transcription product thereof, is resistant to cleavage directed by 
said gene product; 

wherein at least one of the gene products is an external guide sequence capable of binding 
to and effecting the cleavage by RNase P of the second nucleotide sequence. 

4. A system according to claim 3 wherein in addition to an external guide sequence, at 
least one gene product is selected from a ribozyme and an anti-sense ribonucleic acid. 

5. A system according to any one of claims 1 to 4 wherein the viral vector is a 
retroviral vector. 

6. A system according to claim 5 wherein the retroviral vector is a lentiviral vector. 

7. A system according to claim 6 wherein the lentiviral vector is an HIV vector. 

8. A system according to any one of claims 5 to 7 wherein the polypeptide required 
for the assembly of viral particles is selected from gag, pol and env proteins. 

9. A system according to claim 8 wherein at least the gag and pol proteins are from a 
lentivirus. 

10. A system according to claim 7 wherein the env protein is from a lentivirus. 

11. A system according to claim 9 or 10 wherein the lentivirus is HIV. 

12. A system according to any one of the preceding claims wherein the third nucleotide 
sequence is resistant to cleavage directed by the gene product as a result of one or more 
conservative alterations in the nucleotide sequence which remove cleavage sites recognised 
by the at least one gene product and/or binding sites for the at least one gene product. 

13. A system according to any one of claims 1 to 11 wherein the third nucleotide 
sequence is adapted to be resistant to cleavage by the at least one gene product. 

14. A system according to any one of the preceding claims wherein the third nucleotide 
sequence is codon optimised for expression in producer cells. 

15. A system according to claim 14, wherein the producer cells are mammalian cells. 

16. A system according to any one of the preceding claims comprising a plurality of 
first nucleotide sequences and third nucleotide sequences as defined therein. 

17. A viral particle comprising a viral vector genome as defined in any one of claims 3 
to 16 and one or more third nucleotide sequences as defined in any of claims 3 to 16. 

18. A viral particle produced using a viral vector production system according to any 
one of claims 3 to 16. 

19. A method for producing a viral particle which method comprises introducing into a 
host cell (i) a viral genome as defined in any one of claims 3 to 16 (ii) one or more third 
nucleotide sequences as defined in any of claims 3 to 16 and (iii) nucleotide sequences 
encoding the other essential viral packaging components not encoded by the one or more 
third nucleotide sequences. 

20. A viral particle produced by the method of claim 19. 

21. A pharmaceutical composition comprising a viral particle according to claims 17, 
18 or 20 together with a pharmaceutically acceptable carrier or diluent. 

22. A viral system according to any one of claims 1 to 17 or a viral particle according 
to claims 17, 18 or 20 in treating a viral infection. 

23. A viral system according to any one of claims 1 to 17 for use in a method of 
producing viral particles. 

ANTI-VIRAL VECTORS 

A viral vector production system is provided which system comprises: 

(i) a viral genome comprising at least one first nucleotide sequence encoding a gene 
product capable of binding to and effecting the cleavage, directly or indirectly, of a second 
nucleotide sequence, or transcription product thereof, encoding a viral polypeptide required 
for the assembly of viral particles; 

(ii) a third nucleotide sequence encoding said viral polypeptide required for the 
assembly of the viral genome into viral particles, which third nucleotide sequence has a 
different nucleotide sequence to the second nucleotide sequence such that said third 
nucleotide sequence, or transcription product thereof, is resistant to cleavage directed by 
said gene product; 

wherein at least one of the gene products is an external guide sequence capable of binding 
to and effecting the cleavage by RNase P of the second nucleotide sequence. 

The viral vector production system may be used to produce viral particles for use in 
treating or preventing viral infection. 
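
The relationship between the second and third nucleotide sequences can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the method of the specification. The 15-nucleotide input is a toy built from the first four codons printed in Figure 3 followed by a TAG stop codon; it is not the gag-pol sequence. The target site, the function names and the one-codon-per-amino-acid table are hypothetical, the latter chosen so that the recoded output begins with the first four codons printed in Figure 4. Treating susceptibility as a simple substring match is a deliberate simplification, since cleavage directed by an EGS depends on base pairing and RNA structure.

```python
# Standard genetic code, restricted to the codons used in this toy example.
CODON_TO_AA = {
    "ATG": "M", "GGT": "G", "GGC": "G", "GCG": "A", "GCC": "A",
    "AGA": "R", "CGC": "R", "TAG": "*",
}

# Hypothetical choice of one synonymous codon per amino acid (illustrative only).
PREFERRED = {"M": "ATG", "G": "GGC", "A": "GCC", "R": "CGC", "*": "TAG"}


def codons(seq):
    """Split a coding sequence into in-frame triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TO_AA[c] for c in codons(seq))


def recode(seq):
    """Replace every codon with the chosen synonymous codon."""
    return "".join(PREFERRED[CODON_TO_AA[c]] for c in codons(seq))


second = "ATGGGTGCGAGATAG"   # toy stand-in for the second nucleotide sequence
target_site = "GGTGCGAGA"    # hypothetical site recognised by the gene product
third = recode(second)       # toy stand-in for the third nucleotide sequence

print(third)                              # recoded sequence
print(translate(second), translate(third))  # encoded polypeptide is unchanged
print(target_site in second, target_site in third)
```

In this toy case both sequences encode the same polypeptide, while the hypothetical target site is present only in the unmodified sequence.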
Figure 3 

gagpol-HXB2 -> Codon Usage 

DNA sequence 4308 b.p. ATGGGTGCGAGA ... GATGAGGATTAG 

1436 codons 

MW : 161929 Dalton CAI(S.c) : 0.083 CAI(E.c) : 0.151 

TTT phe F 21    TCT ser S  3    TAT tyr Y 30    TGT cys C 18
TTC phe F 14    TCC ser S  3    TAC tyr Y  9    TGC cys C  2
TTA leu L 46    TCA ser S 19    TAA OCH Z       TGA OPA Z
TTG leu L 11    TCG ser S  1    TAG AMB Z  1    TGG trp W 37
CTT leu L 13    CCT pro P 21    CAT his H 20    CGT arg R
CTC leu L  7    CCC pro P 14    CAC his H  7    CGC arg R
CTA leu L 17    CCA pro P 41    CAA gln Q 56    CGA arg R  3
CTG leu L 16    CCG pro P       CAG gln Q 39    CGG arg R  3
ATT ile I 30    ACT thr T 24    AAT asn N 42    AGT ser S 18
ATC ile I 14    ACC thr T 20    AAC asn N 16    AGC ser S 16
ATA ile I 56    ACA thr T 43    AAA lys K 33    AGA arg R 45
ATG met M 29    ACG thr T  1    AAG lys K 34    AGG arg R 18
GTT val V 15    GCT ala A 17    GAT asp D 37    GGT gly G 11
GTC val V 11    GCC ala A 19    GAC asp D 26    GGC gly G 10
GTA val V 55    GCA ala A 55    GAA glu E 75    GGA gly G 61
GTG val V 15    GCG ala A  5    GAG glu E 32    GGG gly G 26
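
A codon usage table of the kind shown in Figure 3 can be obtained by tallying the in-frame triplets of a coding sequence. The following Python sketch shows one way this might be done. The function name `codon_usage` and the 18-nucleotide toy sequence are hypothetical and are not taken from the specification, and the sketch does not compute the MW or CAI values reported in the figure.

```python
from collections import Counter


def codon_usage(seq):
    """Count in-frame codons of a coding sequence read from position 1."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(seq[i:i + 3] for i in range(0, len(seq), 3))


# Hypothetical toy sequence (six codons), used only to exercise the function.
toy = "ATGGGTGCGAGAGGTTAG"
usage = codon_usage(toy)

# Print one line per codon, in alphabetical order.
for codon in sorted(usage):
    print(codon, usage[codon])
```

Applied to a full-length coding sequence, the counts would sum to the number of codons, which gives a simple consistency check on a table.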






Figure 4 

gagpol-SYNgp [1 to 4303] -> Codon Usage 

DNA sequence 4303 b.p. ATGGGCGCCCGC ... GATGAGGATTAG linear 

1436 codons 

MW : 161929 Dalton CAI(S.c) : 0.030 CAI(E.c) : 0.296 

TTT phe F  5    TCT ser S  5    TAT tyr Y 10    TGT cys C  6
TTC phe F 30    TCC ser S 11    TAC tyr Y 29    TGC cys C 14
TTA leu L  2    TCA ser S  4    TAA OCH Z       TGA OPA Z
TTG leu L  7    TCG ser S  6    TAG AMB Z  1    TGG trp W 37
CTT leu L  3    CCT pro P 14    CAT his H  6    CGT arg R  2
CTC leu L 22    CCC pro P 39    CAC his H 21    CGC arg R 34
CTA leu L  6    CCA pro P 10    CAA gln Q 14    CGA arg R  3
CTG leu L 70    CCG pro P 13    CAG gln Q 81    CGG arg R 10
ATT ile I 17    ACT thr T 11    AAT asn N 13    AGT ser S  7
ATC ile I 79    ACC thr T 48    AAC asn N 45    AGC ser S 27
ATA ile I  4    ACA thr T 13    AAA lys K 25    AGA arg R  7
ATG met M 29    ACG thr T 16    AAG lys K 97    AGG arg R 13
GTT val V  5    GCT ala A 15    GAT asp D 19    GGT gly G 10
GTC val V 27    GCC ala A 56    GAC asp D 44    GGC gly G 54
GTA val V  6    GCA ala A 13    GAA glu E 29    GGA gly G 16
GTG val V 58    GCG ala A 12    GAG glu E 73    GGG gly G 28






Figure 5 



env-mn [1 to 2571] -> Codon Usage 

Figure 9B 



Generic design of EGSs to target any RNA. 